Similar literature
1.
To provide protection against viral infection and limit the uptake of mobile genetic elements, bacteria and archaea have evolved many diverse defence systems. The discovery and application of CRISPR-Cas adaptive immune systems has spurred recent interest in the identification and classification of new types of defence systems. Many new defence systems have recently been reported but there is a lack of accessible tools available to identify homologs of these systems in different genomes. Here, we report the Prokaryotic Antiviral Defence LOCator (PADLOC), a flexible and scalable open-source tool for defence system identification. With PADLOC, defence system genes are identified using HMM-based homologue searches, followed by validation of system completeness using gene presence/absence and synteny criteria specified by customisable system classifications. We show that PADLOC identifies defence systems with high accuracy and sensitivity. Our modular approach to organising the HMMs and system classifications allows additional defence systems to be easily integrated into the PADLOC database. To demonstrate application of PADLOC to biological questions, we used PADLOC to identify six new subtypes of known defence systems and a putative novel defence system comprised of a helicase, methylase and ATPase. PADLOC is available as a standalone package (https://github.com/padlocbio/padloc) and as a webserver (https://padloc.otago.ac.nz).
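
The two-step logic in this abstract (homologue hits first, then presence/absence and synteny checks) can be illustrated with a minimal Python sketch. This is not PADLOC's code or data model: the `Hit` record, the `is_complete_system` function and the `max_gap` parameter are hypothetical names chosen for illustration, and the hits are assumed to come from an HMM search that has already been run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One homologue hit (hypothetical record, not PADLOC's format)."""
    family: str      # name of the matched protein family
    gene_index: int  # ordinal position of the gene on its contig


def is_complete_system(hits, required, max_gap=3):
    """Return True if one locus contains every required family.

    Assumption for this sketch: consecutive hits belong to the same locus
    when at most `max_gap` unrelated genes lie between them.
    `required` must be a set of family names.
    """
    cluster = []
    for hit in sorted(hits, key=lambda h: h.gene_index):
        # Start a new locus when the gap to the previous hit is too large.
        if cluster and hit.gene_index - cluster[-1].gene_index > max_gap + 1:
            if required <= {h.family for h in cluster}:
                return True
            cluster = []
        cluster.append(hit)
    # Check the last locus that was still open when the loop ended.
    return bool(cluster) and required <= {h.family for h in cluster}
```

Here the presence/absence criterion is the subset test on family names, and the synteny criterion is the maximum gap between neighbouring hits. A real classification would likely also need optional and forbidden genes, which this sketch leaves out.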

3.
MicroRNAs (miRNAs) are a family of small, non-coding RNA species functioning as negative regulators of multiple target genes including tumour suppressor genes and oncogenes. Many miRNA gene loci are located within cancer-associated genomic regions. To identify potential new amplified oncogenic and/or deleted tumour suppressing miRNAs in lung cancer, we inferred miRNA gene dosage from high dimensional arrayCGH data. From miRBase v9.0 (http://microrna.sanger.ac.uk), 474 human miRNA genes were physically mapped to regions of chromosomal loss or gain identified from a high-resolution genome-wide arrayCGH study of 132 primary non-small cell lung cancers (NSCLCs) (a training set of 60 squamous cell carcinomas and 72 adenocarcinomas). MiRNAs were selected as candidates if their immediately flanking probes or host gene were deleted or amplified in at least 25% of primary tumours using both Analysis of Copy Errors algorithm and fold change (≥±1.2) analyses. Using these criteria, 97 miRNAs mapped to regions of aberrant copy number. Analysis of three independent published lung cancer arrayCGH datasets confirmed that 22 of these miRNA loci showed directionally concordant copy number variation. MiR-218, encoded on 4p15.31 and 5q35.1 within two host genes (SLIT2 and SLIT3), in a region of copy number loss, was selected as a priority candidate for follow-up as it is reported as underexpressed in lung cancer. We confirmed decreased expression of mature miR-218 and its host genes by qRT-PCR in 39 NSCLCs relative to normal lung tissue. This downregulation of miR-218 was found to be associated with a history of cigarette smoking, but not human papilloma virus. Thus, we show for the first time that putative lung cancer-associated miRNAs can be identified from genome-wide arrayCGH datasets using a bioinformatics mapping approach, and report that miR-218 is a strong candidate tumour suppressing miRNA potentially involved in lung cancer.
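
The 25% recurrence criterion in this abstract can be sketched in a few lines of Python. This is a simplified illustration, not the authors' pipeline: it assumes each tumour has already been given a single call of `"gain"`, `"loss"` or `"neutral"` for the miRNA locus, and it does not reproduce the requirement that the two calling methods agree. The function names are hypothetical.

```python
def aberration_frequency(calls, direction):
    """Fraction of tumours whose copy-number call equals `direction`."""
    if not calls:
        return 0.0
    return sum(1 for call in calls if call == direction) / len(calls)


def is_candidate(calls, min_fraction=0.25):
    """Apply the recurrence threshold to gains and losses separately.

    A locus qualifies if either direction alone reaches `min_fraction`,
    so gains and losses are never pooled.
    """
    return max(
        aberration_frequency(calls, "gain"),
        aberration_frequency(calls, "loss"),
    ) >= min_fraction
```

With 132 tumours, as in the training set described above, a locus lost in 40 of them (about 30%) would pass, while one gained in only 10 (about 8%) would not.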

4.
The promise of personalized cancer medicine cannot be fulfilled until we gain better understanding of the connections between the genomic makeup of a patient's tumor and its response to anticancer drugs. Several datasets that include both pharmacologic profiles of cancer cell lines as well as their genomic alterations have been recently developed and extensively analyzed. However, most analyses of these datasets assume that mutations in a gene will have the same consequences regardless of their location. While this assumption might be correct in some cases, such analyses may miss subtler, yet still relevant, effects mediated by mutations in specific protein regions. Here we study such perturbations by separating effects of mutations in different protein functional regions (PFRs), including protein domains and intrinsically disordered regions. Using this approach, we have been able to identify 171 novel associations between mutations in specific PFRs and changes in the activity of 24 drugs that couldn't be recovered by traditional gene-centric analyses. Our results demonstrate how focusing on individual protein regions can provide novel insights into the mechanisms underlying the drug sensitivity of cancer cell lines. Moreover, while these new correlations are identified using only data from cancer cell lines, we have been able to validate some of our predictions using data from actual cancer patients. Our findings highlight how gene-centric experiments (such as systematic knock-out or silencing of individual genes) are missing relevant effects mediated by perturbations of specific protein regions. All the associations described here are available from http://www.cancer3d.org.
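
The difference between a gene-centric and a region-centric view can be made concrete with a small Python sketch that groups mutations by the protein region they fall in. The region names, coordinates and function names below are hypothetical and serve only as an illustration. The abstract does not describe how the authors annotated regions or tested associations.

```python
def regions_hit(position, regions):
    """Names of all regions containing a residue position (1-based, inclusive).

    `regions` is a list of (name, start, end) tuples; regions may overlap.
    """
    return [name for name, start, end in regions if start <= position <= end]


def group_mutations_by_region(mutations, regions):
    """Map each region name to the mutated positions that fall inside it."""
    grouped = {name: [] for name, _, _ in regions}
    for position in mutations:
        for name in regions_hit(position, regions):
            grouped[name].append(position)
    return grouped
```

A gene-centric analysis would treat all mutated positions of a protein as one group. Here, positions outside every annotated region are simply dropped, and each region can then be tested against drug response on its own.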

5.
Two dimensional polyacrylamide gel electrophoresis (2D PAGE) is used to identify differentially expressed proteins and may be applied to biomarker discovery. A limitation of this approach is the inability to detect a protein when its concentration falls below the limit of detection. Consequently, differential expression of proteins may be missed when the level of a protein in the cases or controls is below the limit of detection for 2D PAGE. Standard statistical techniques have difficulty dealing with undetected proteins. To address this issue, we propose a mixture model that takes into account both detected and non-detected proteins. Non-detected proteins are classified either as (a) proteins that are not expressed in at least one replicate, or (b) proteins that are expressed but are below the limit of detection. We obtain maximum likelihood estimates of the parameters of the mixture model, including the group-specific probability of expression and mean expression intensities. Differentially expressed proteins can be detected by using a Likelihood Ratio Test (LRT). Our simulation results, using data generated from biological experiments, show that the likelihood model has higher statistical power than standard statistical approaches to detect differentially expressed proteins. An R package, Slider (Statistical Likelihood model for Identifying Differential Expression in R), is freely available at http://www.cebl.auckland.ac.nz/slider.php.
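
One way to read the mixture model in this abstract is sketched below in Python. It rests on assumptions the abstract does not state: expressed intensities are taken to be normally distributed (for example on a log scale), the limit of detection `lod` is known, and a non-detected spot is recorded as `None`. The function names are hypothetical and the Slider package may parameterise the model differently.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Cumulative probability of a normal distribution up to x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def log_likelihood(observations, p, mu, sigma, lod):
    """Log-likelihood of one group under the assumed mixture (0 < p < 1).

    A non-detected value (None) is either not expressed, with probability
    1 - p, or expressed but below `lod`. A detected value contributes the
    probability of expression times the normal density at that intensity.
    """
    total = 0.0
    for x in observations:
        if x is None:
            total += math.log((1.0 - p) + p * normal_cdf(lod, mu, sigma))
        else:
            total += math.log(p * normal_pdf(x, mu, sigma))
    return total


def lrt_statistic(loglik_null, loglik_alt):
    """Likelihood ratio test statistic from two maximised log-likelihoods."""
    return 2.0 * (loglik_alt - loglik_null)
```

In use, the null model would share the parameters between cases and controls and the alternative would fit them per group. The maximisation itself is left out of this sketch.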

6.
Smaug, a protein repressing translation and inducing mRNA decay, directly controls an unexpectedly large number of maternal mRNAs driving early Drosophila development. (See related research: http://genomebiology.com/2014/15/1/R4) Regulation of translation and mRNA stability is a key aspect of early metazoan development. One of the best studied factors involved in these processes is the Drosophila protein Smaug. In this issue of Genome Biology, Chen et al. [1] report that a large number of maternal mRNAs in the fly embryo are probably regulated directly by Smaug.

7.
Data presentation for scientific publications in small sample size studies has not changed substantially in decades. It relies on static figures and tables that may not provide sufficient information for critical evaluation, particularly of the results from small sample size studies. Interactive graphics have the potential to transform scientific publications from static reports of experiments into interactive datasets. We designed an interactive line graph that demonstrates how dynamic alternatives to static graphics for small sample size studies allow for additional exploration of empirical datasets. This simple, free, web-based tool (http://statistika.mfub.bg.ac.rs/interactive-graph/) demonstrates the overall concept and may promote widespread use of interactive graphics.

9.
The database of imprinted genes and parent-of-origin effects in animals (http://www.otago.ac.nz/IGC) is a collation of genes and phenotypes for which parent-of-origin effects have been reported. The database currently includes over 220 entries, which describe over 40 imprinted genes in human, mouse and other animals. In addition, a wide variety of other parent-of-origin effects, such as transmission of human disease phenotypes, transmission of QTLs, uniparental disomies and interspecies crosses, are recorded. Data are accessed through a search engine and references are hyperlinked to PubMed.

10.
The role of alternative splicing in self-renewal, pluripotency and tissue lineage specification of human embryonic stem cells (hESCs) is largely unknown. To better define these regulatory cues, we modified the H9 hESC line to allow selection of pluripotent hESCs by neomycin resistance and cardiac progenitors by puromycin resistance. Exon-level microarray expression data from undifferentiated hESCs and cardiac and neural precursors were used to identify splice isoforms with cardiac-restricted or common cardiac/neural differentiation expression patterns. Splice events for these groups corresponded to the pathways of cytoskeletal remodeling, RNA splicing, muscle specification, and cell cycle checkpoint control as well as genes with serine/threonine kinase and helicase activity. Using a new program named AltAnalyze (http://www.AltAnalyze.org), we identified novel changes in protein domain and microRNA binding site architecture that were predicted to affect protein function and expression. These included an enrichment of splice isoforms that oppose cell-cycle arrest in hESCs and that promote calcium signaling and cardiac development in cardiac precursors. By combining genome-wide predictions of alternative splicing with new functional annotations, our data suggest potential mechanisms that may influence lineage commitment and hESC maintenance at the level of specific splice isoforms and microRNA regulation.

11.
The mechanism of RNA thermometers is a subject of growing interest. Also known as RNA thermosensors, these temperature-sensitive segments of the mRNA regulate gene expression by changing their secondary structure in response to temperature fluctuations. The detection of RNA thermometers in various genes of interest is valuable as it could lead to the discovery of new thermometers participating in fundamental processes such as preferential translation during heat-shock. RNAthermsw is a user-friendly webserver for predicting the location of RNA thermometers using direct temperature simulations. It operates by analyzing dotted figures generated as a result of a moving window that performs successive energy minimization folding predictions. Inputs include the RNA sequence, window size, and desired temperature change. RNAthermsw can be freely accessed at http://www.cs.bgu.ac.il/~rnathemsw/RNAthemsw/ (with the slash sign at the end). The website contains a help page with explanations regarding the exact usage.

12.
The interaction environment of a protein in a cellular network is important in defining the role that the protein plays in the system as a whole, and thus its potential suitability as a drug target. Despite the importance of the network environment, it is neglected during target selection for drug discovery. Here, we present the first systematic, comprehensive computational analysis of topological, community and graphical network parameters of the human interactome and identify discriminatory network patterns that strongly distinguish drug targets from the interactome as a whole. Importantly, we identify striking differences in the network behavior of targets of cancer drugs versus targets from other therapeutic areas and explore how they may relate to successful drug combinations to overcome acquired resistance to cancer drugs. We develop, computationally validate and provide the first public domain predictive algorithm for identifying druggable neighborhoods based on network parameters. We also make available full predictions for 13,345 proteins to aid target selection for drug discovery. All target predictions are available through canSAR.icr.ac.uk. Underlying data and tools are available at https://cansar.icr.ac.uk/cansar/publications/druggable_network_neighbourhoods/.

13.
eIF5A is an essential and evolutionarily conserved translation elongation factor, which has recently been proposed to be required for the translation of proteins with consecutive prolines. The binding of eIF5A to ribosomes occurs upon its activation by hypusination, a modification that requires spermidine, an essential factor for mammalian fertility that also promotes yeast mating. We show that in response to pheromone, hypusinated eIF5A is required for shmoo formation, localization of polarisome components, induction of cell fusion proteins, and actin assembly in yeast. We also show that eIF5A is required for the translation of Bni1, a proline-rich formin involved in polarized growth during shmoo formation. Our data indicate that translation of the polyproline motifs in Bni1 is eIF5A dependent and this translation dependency is lost upon deletion of the polyprolines. Moreover, an exogenous increase in Bni1 protein levels partially restores the defect in shmoo formation seen in eIF5A mutants. Overall, our results identify eIF5A as a novel and essential regulator of yeast mating through formin translation. Since eIF5A and polyproline formins are conserved across species, our results also suggest that eIF5A-dependent translation of formins could regulate polarized growth in such processes as fertility and cancer in higher eukaryotes.

14.
Cox regression is commonly used to predict the outcome by the time to an event of interest and, in addition, to identify relevant features for survival analysis in cancer genomics. Due to the high-dimensionality of high-throughput genomic data, existing Cox models trained on any particular dataset usually generalize poorly to other independent datasets. In this paper, we propose a network-based Cox regression model called Net-Cox and apply it to a large-scale survival analysis across multiple ovarian cancer datasets. Net-Cox integrates gene network information into the Cox's proportional hazard model to explore the co-expression or functional relation among high-dimensional gene expression features in the gene network. Net-Cox was applied to analyze three independent gene expression datasets including the TCGA ovarian cancer dataset and two other public ovarian cancer datasets. Net-Cox with the network information from gene co-expression or functional relations identified highly consistent signature genes across the three datasets, and because of the better generalization across the datasets, Net-Cox also consistently improved the accuracy of survival prediction over the Cox models regularized by or . This study focused on analyzing the death and recurrence outcomes in the treatment of ovarian carcinoma to identify signature genes that can more reliably predict the events. The signature genes comprise dense protein-protein interaction subnetworks, enriched by extracellular matrix receptors and modulators or by nuclear signaling components downstream of extracellular signal-regulated kinases.
In the laboratory validation of the signature genes, a tumor array experiment by protein staining on an independent patient cohort from Mayo Clinic showed that the protein expression of the signature gene FBN1 is a biomarker significantly associated with the early recurrence after 12 months of the treatment in the ovarian cancer patients who are initially sensitive to chemotherapy. Net-Cox toolbox is available at http://compbio.cs.umn.edu/Net-Cox/.

17.
Circular dichroism (CD) spectroscopy is a widely‐used method for characterizing the secondary structures of proteins. The well‐established and highly used analysis website, DichroWeb (located at: http://dichroweb.cryst.bbk.ac.uk/html/home.shtml) enables the facile quantitative determination of helix, sheet, and other secondary structure contents of proteins based on their CD spectra. DichroWeb includes a range of reference datasets and algorithms, plus graphical and quantitative methods for determining the quality of the analyses produced. This article describes the current website content, usage and accessibility, as well as the many upgraded features now present in this highly popular tool that was originally created nearly two decades ago.

19.
The diversity of microbial species in a metagenomic study is commonly assessed using 16S rRNA gene sequencing. With the rapid developments in genome sequencing technologies, the focus has shifted towards the sequencing of hypervariable regions of the 16S rRNA gene instead of full length gene sequencing. Therefore, 16S Classifier is developed using a machine learning method, Random Forest, for faster and accurate taxonomic classification of short hypervariable regions of 16S rRNA sequence. It displayed precision values of up to 0.91 on training datasets and precision values of up to 0.98 on the test dataset. On real metagenomic datasets, it showed up to 99.7% accuracy at the phylum level and up to 99.0% accuracy at the genus level. 16S Classifier is available freely at http://metagenomics.iiserb.ac.in/16Sclassifier and http://metabiosys.iiserb.ac.in/16Sclassifier.
