Similar literature
 Found 20 similar documents; search took 31 ms
1.
The exponential growth of the biomedical literature is making the need for efficient, accurate text-mining tools increasingly clear. The identification of named biological entities in text is a central and difficult task. We have developed an efficient algorithm and implementation of a dictionary-based approach to named entity recognition, which we here use to identify names of species and other taxa in text. The tool, SPECIES, is more than an order of magnitude faster than existing tools and as accurate. Precision and recall were assessed both on an existing gold-standard corpus and on a new corpus of 800 abstracts, which were manually annotated after the development of the tool. The corpus comprises abstracts from journals selected to represent many taxonomic groups, which gives insights into which types of organism names are hard to detect and which are easy. Finally, we have tagged organism names in the entire Medline database and developed a web resource, ORGANISMS, that makes the results accessible to the broad community of biologists. The SPECIES software is open source and can be downloaded from http://species.jensenlab.org along with dictionary files and the manually annotated gold-standard corpus. The ORGANISMS web resource can be found at http://organisms.jensenlab.org.
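The dictionary-based approach described in this abstract can be illustrated with a minimal sketch. This is not the SPECIES implementation; it is a simple longest-match lookup written for illustration, and the function name `tag_names`, the dictionary contents, and the identifiers `ID-1` and `ID-2` are all hypothetical.

```python
import re


def tag_names(text, dictionary):
    """Find dictionary names in text using greedy longest-match lookup.

    `dictionary` maps lowercase, single-space-separated names to identifiers.
    Returns a list of (start, end, matched_text, identifier) tuples.
    """
    # Character offsets of every word token in the text.
    tokens = [(m.start(), m.end()) for m in re.finditer(r"\w+", text)]
    # Longest dictionary name, measured in words, bounds the search window.
    max_len = max((len(name.split()) for name in dictionary), default=0)

    matches = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest span first so a binomial beats its genus alone.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tokens[i:i + n]
            candidate = " ".join(text[s:e] for s, e in span).lower()
            if candidate in dictionary:
                start, end = span[0][0], span[-1][1]
                matches.append((start, end, text[start:end], dictionary[candidate]))
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return matches


# Hypothetical dictionary with placeholder identifiers.
names = {"saccharomyces cerevisiae": "ID-1", "saccharomyces": "ID-2"}
sentence = "Deletion of MIH1 in Saccharomyces cerevisiae is viable."
for start, end, matched_text, identifier in tag_names(sentence, names):
    print(start, end, matched_text, identifier)
```

Trying the longest span first is one common design choice for dictionary taggers; a production tool would also need to handle abbreviations such as "S. cerevisiae", which this sketch ignores.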

2.
Text mining for the life sciences aims to aid database curation, knowledge summarization and information retrieval through the automated processing of biomedical texts. To provide comprehensive coverage and enable full integration with existing biomolecular database records, it is crucial that text mining tools scale up to millions of articles and that their analyses can be unambiguously linked to information recorded in resources such as UniProt, KEGG, BioGRID and NCBI databases. In this study, we investigate how fully automated text mining of complex biomolecular events can be augmented with a normalization strategy that identifies biological concepts in text, mapping them to identifiers at varying levels of granularity, ranging from canonicalized symbols to unique genes and proteins and broad gene families. To this end, we have combined two state-of-the-art text mining components, previously evaluated on two community-wide challenges, and have extended and improved upon these methods by exploiting their complementary nature. Using these systems, we perform normalization and event extraction to create a large-scale resource that is publicly available, unique in semantic scope, and covers all 21.9 million PubMed abstracts and 460 thousand PubMed Central open access full-text articles. This dataset contains 40 million biomolecular events involving 76 million gene/protein mentions, linked to 122 thousand distinct genes from 5032 species across the full taxonomic tree. Detailed evaluations and analyses reveal promising results for application of this data in database and pathway curation efforts. The main software components used in this study are released under an open-source license. Further, the resulting dataset is freely accessible through a novel API, providing programmatic and customized access (http://www.evexdb.org/api/v001/). Finally, to allow for large-scale bioinformatic analyses, the entire resource is available for bulk download from http://evexdb.org/download/, under the Creative Commons – Attribution – Share Alike (CC BY-SA) license.

3.
While a huge amount of information about biological literature can be obtained by searching the PubMed database, reading through all the titles and abstracts resulting from such a search for useful information is inefficient. Text mining makes it possible to increase this efficiency. Some websites use text mining to gather information from the PubMed database; however, they are database-oriented, using pre-defined search keywords while lacking a query interface for user-defined search inputs. We present the PubMed Abstract Reading Helper (PubstractHelper) website, which combines text mining and reading assistance for an efficient PubMed search. PubstractHelper can accept a maximum of ten groups of keywords, with each group containing up to ten keywords. The principle behind the text-mining function of PubstractHelper is that keywords contained in the same sentence are likely to be related. PubstractHelper highlights sentences with co-occurring keywords in different colors. The user can download the PMIDs and the abstracts with color markings to be reviewed later. By flagging publications in which related keywords appear together, the PubstractHelper website can help users identify relevant publications for their research.

Availability

http://bio.yungyun.com.tw/ATM/PubstractHelper.aspx and http://holab.med.ncku.edu.tw/ATM/PubstractHelper.aspx
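The same-sentence principle stated in this abstract can be sketched in a few lines. This is not the PubstractHelper code; the function name `cooccurring_sentences`, the naive sentence splitter, and the case-insensitive substring matching are assumptions made for illustration. Only the limits of ten groups and ten keywords per group come from the abstract.

```python
import re

MAX_GROUPS = 10    # limit stated in the abstract
MAX_KEYWORDS = 10  # limit stated in the abstract


def cooccurring_sentences(abstract, groups):
    """Return (sentence, matched_group_indexes) for sentences hit by >= 2 groups.

    `groups` is a list of keyword lists. Matching is a case-insensitive
    substring test, which is an assumption of this sketch.
    """
    if len(groups) > MAX_GROUPS or any(len(g) > MAX_KEYWORDS for g in groups):
        raise ValueError("at most ten groups of at most ten keywords each")

    # Naive splitter: breaks after ., ! or ? followed by whitespace.
    # It will wrongly split abbreviations such as "S. cerevisiae".
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", abstract) if s.strip()]

    hits = []
    for sentence in sentences:
        lowered = sentence.lower()
        matched = [
            index
            for index, group in enumerate(groups)
            if any(keyword.lower() in lowered for keyword in group)
        ]
        if len(matched) >= 2:
            hits.append((sentence, matched))
    return hits


example = "Rad53 phosphorylates Dbf4. Cdc7 is a kinase. Wee1 regulates Cdk1 activity."
for sentence, matched in cooccurring_sentences(example, [["Rad53"], ["Dbf4"], ["Cdk1"]]):
    print(matched, sentence)
```

A highlighting front end could then color each returned sentence according to the group indexes it matched.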

4.
Since the discovery of microRNAs (miRNAs) only two decades ago, they have emerged as an essential component of the gene regulatory machinery. miRNAs have seemingly paradoxical features: a single miRNA is able to simultaneously target hundreds of genes, while its presence is mostly dispensable for animal viability under normal conditions. It is known that miRNAs act as stress response factors; however, it remains challenging to determine their relevant targets and the conditions under which they function. To address this challenge, we propose a new workflow for miRNA function analysis, by which we found that the evolutionarily young miRNA family, the mir-310s (mir-310/mir-311/mir-312/mir-313), are important regulators of Drosophila metabolic status. mir-310s-deficient animals have an abnormal diet-dependent expression profile for numerous diet-sensitive components, accumulate fats, and show various physiological defects. We found that the mir-310s simultaneously repress the production of several regulatory factors (Rab23, DHR96, and Ttk) of the evolutionarily conserved Hedgehog (Hh) pathway to sharpen dietary response. As the mir-310s expression is highly dynamic and nutrition sensitive, this signal relay model helps to explain the molecular mechanism governing quick and robust Hh signaling responses to nutritional changes. Additionally, we discovered a new component of the Hh signaling pathway in Drosophila, Rab23, which cell autonomously regulates Hh ligand trafficking in the germline stem cell niche. How organisms adjust to dietary fluctuations to sustain healthy homeostasis is an intriguing research topic. These data are the first to report that miRNAs can act as executives that transduce nutritional signals to an essential signaling pathway. This suggests miRNAs as plausible therapeutic agents that can be used in combination with low calorie and cholesterol diets to manage quick and precise tissue-specific responses to nutritional changes.

5.
Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems’ output and assess the quality as well as the contribution of individual systems to the quality of the silver standard. Our results demonstrate that mainly the NCBO Annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independent of the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when combined with NCBO Annotator and cTAKES due to a low recall. In conclusion, the performances of individual systems need to be improved independently of the text types, and the leveraging strategies to best take advantage of individual systems’ annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and assigned annotations of the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources. The textual content of the ShARe/CLEF (https://sites.google.com/site/shareclefehealth/data) and i2b2 (https://i2b2.org/NLP/DataSets/) corpora needs to be requested from the individual corpus providers.
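The evaluation vocabulary in this abstract (silver standard, precision, recall, F1-measure) can be made concrete with a small sketch. The abstract does not say how the silver standard was assembled, so the vote threshold below is an assumption, as are the function names and the toy annotation tuples; only the precision, recall and F1 definitions are standard.

```python
from collections import Counter


def silver_standard(system_outputs, min_votes=2):
    """Keep annotations proposed by at least `min_votes` systems.

    Voting is an assumed combination strategy, not the one in the paper.
    Each system output is a set of hashable annotations.
    """
    votes = Counter(a for output in system_outputs for a in set(output))
    return {annotation for annotation, count in votes.items() if count >= min_votes}


def precision_recall_f1(predicted, gold):
    """Standard precision, recall and F1 over two annotation sets."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy annotations: (document id, start, end, concept id). All hypothetical.
a1 = ("doc1", 0, 5, "C1")
a2 = ("doc1", 10, 15, "C2")
a3 = ("doc1", 20, 25, "C3")

outputs = [{a1, a2}, {a1}, {a2, a3}, set()]
silver = silver_standard(outputs)        # a1 and a2 each get two votes
gold = {a1, a3}
print(precision_recall_f1(silver, gold))
```

Raising `min_votes` tends to trade recall for precision, which is the kind of effect the abstract reports when low-recall systems are combined with the others.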

6.
Cdk1 activity drives both mitotic entry and the metaphase-to-anaphase transition in all eukaryotes. The kinase Wee1 and the phosphatase Cdc25 regulate the mitotic activity of Cdk1 by the reversible phosphorylation of a conserved tyrosine residue. Mutation of cdc25 in Schizosaccharomyces pombe blocks Cdk1 dephosphorylation and causes cell cycle arrest. In contrast, deletion of MIH1, the cdc25 homolog in Saccharomyces cerevisiae, is viable. Although Cdk1-Y19 phosphorylation is elevated during mitosis in mih1∆ cells, Cdk1 is dephosphorylated as cells progress into G1, suggesting that additional phosphatases regulate Cdk1 dephosphorylation. Here we show that the phosphatase Ptp1 also regulates Cdk1 dephosphorylation in vivo and can directly dephosphorylate Cdk1 in vitro. Using a novel in vivo phosphatase assay, we also show that PP2A bound to Rts1, the budding yeast B56-regulatory subunit, regulates dephosphorylation of Cdk1 independently of a function regulating Swe1, Mih1, or Ptp1, suggesting that PP2A-Rts1 either directly dephosphorylates Cdk1-Y19 or regulates an unidentified phosphatase.

7.
8.
Despite the importance of clathrin-mediated endocytosis (CME) for cell biology, it is unclear if all components of the machinery have been discovered and many regulatory aspects remain poorly understood. Here, using Saccharomyces cerevisiae and a fluorescence microscopy screening approach, we identify previously unknown regulatory factors of the endocytic machinery. We further studied the top scoring protein identified in the screen, Ubx3, a member of the conserved ubiquitin regulatory X (UBX) protein family. In vivo and in vitro approaches demonstrate that Ubx3 is a new coat component. Ubx3-GFP has typical endocytic coat protein dynamics with a patch lifetime of 45 ± 3 sec. Ubx3 contains a W-box that mediates physical interaction with clathrin, and Ubx3-GFP patch lifetime depends on clathrin. Deletion of the UBX3 gene caused defects in the uptake of Lucifer Yellow and the methionine transporter Mup1, demonstrating that Ubx3 is needed for efficient endocytosis. Further, the UBX domain is required both for localization and function of Ubx3 at endocytic sites. Mechanistically, Ubx3 regulates dynamics and patch lifetime of the early arriving protein Ede1 but not later arriving coat proteins or actin assembly. Conversely, Ede1 regulates the patch lifetime of Ubx3. Ubx3 likely regulates CME via the AAA-ATPase Cdc48, a ubiquitin-editing complex. Our results uncovered new components of the CME machinery that regulate this fundamental process.

9.
Homologous recombination is associated with the dynamic assembly and disassembly of DNA–protein complexes. Assembly of a nucleoprotein filament comprising ssDNA and the RecA homolog, Rad51, is a key step required for homology search during recombination. The budding yeast Srs2 DNA translocase is known to dismantle Rad51 filaments in vitro. However, there is limited evidence to support the dismantling activity of Srs2 in vivo. Here, we show that Srs2 indeed disrupts Rad51-containing complexes from chromosomes during meiosis. Overexpression of Srs2 during the meiotic prophase impairs meiotic recombination and removes Rad51 from meiotic chromosomes. This dismantling activity is specific for Rad51, as Srs2 overexpression does not remove Dmc1 (a meiosis-specific Rad51 homolog), Rad52 (a Rad51 mediator), or replication protein A (RPA; a single-stranded DNA-binding protein). Rather, RPA replaces Rad51 under these conditions. A mutant Srs2 lacking helicase activity cannot remove Rad51 from meiotic chromosomes. Interestingly, the Rad51-binding domain of Srs2, which is critical for Rad51-dismantling activity in vitro, is not essential for this activity in vivo. Our results suggest that a precise level of Srs2, in the form of the Srs2 translocase, is required to appropriately regulate the Rad51 nucleoprotein filament dynamics during meiosis.

10.
Cytohesins are Arf guanine nucleotide exchange factors (GEFs) that regulate membrane trafficking and actin cytoskeletal dynamics. We report here that GRP-1, the sole Caenorhabditis elegans cytohesin, controls the asymmetric divisions of certain neuroblasts that divide to produce a larger neuronal precursor or neuron and a smaller cell fated to die. In the Q neuroblast lineage, loss of GRP-1 led to the production of daughter cells that are more similar in size and to the transformation of the normally apoptotic daughter into its sister, resulting in the production of extra neurons. Genetic interactions suggest that GRP-1 functions with the previously described Arf GAP CNT-2 and two other Arf GEFs, EFA-6 and BRIS-1, to regulate the activity of Arf GTPases. In agreement with this model, we show that GRP-1’s GEF activity, mediated by its SEC7 domain, is necessary for the posterior Q daughter cell (Q.p) neuroblast division and that both GRP-1 and CNT-2 function in Q.p to promote its asymmetry. Although functional GFP-tagged GRP-1 proteins localized to the nucleus, the extra cell defects were rescued by targeting the Arf GEF activity of GRP-1 to the plasma membrane, suggesting that GRP-1 acts at the plasma membrane. The detection of endogenous GRP-1 protein at cytokinesis remnants, or midbodies, is consistent with GRP-1 functioning at the plasma membrane and perhaps at the cytokinetic furrow to promote the asymmetry of the divisions that require its function.

11.
eIF5A is an essential and evolutionarily conserved translation elongation factor, which has recently been proposed to be required for the translation of proteins with consecutive prolines. The binding of eIF5A to ribosomes occurs upon its activation by hypusination, a modification that requires spermidine, an essential factor for mammalian fertility that also promotes yeast mating. We show that in response to pheromone, hypusinated eIF5A is required for shmoo formation, localization of polarisome components, induction of cell fusion proteins, and actin assembly in yeast. We also show that eIF5A is required for the translation of Bni1, a proline-rich formin involved in polarized growth during shmoo formation. Our data indicate that translation of the polyproline motifs in Bni1 is eIF5A dependent and this translation dependency is lost upon deletion of the polyprolines. Moreover, an exogenous increase in Bni1 protein levels partially restores the defect in shmoo formation seen in eIF5A mutants. Overall, our results identify eIF5A as a novel and essential regulator of yeast mating through formin translation. Since eIF5A and polyproline formins are conserved across species, our results also suggest that eIF5A-dependent translation of formins could regulate polarized growth in such processes as fertility and cancer in higher eukaryotes.

12.
13.
14.
Dbf4-dependent kinase (DDK) and cyclin-dependent kinase (CDK) are essential to initiate DNA replication at individual origins. During replication stress, the S-phase checkpoint inhibits the DDK- and CDK-dependent activation of late replication origins. Rad53 kinase is a central effector of the replication checkpoint and both binds to and phosphorylates Dbf4 to prevent late-origin firing. The molecular basis for the Rad53–Dbf4 physical interaction is not clear but occurs through the Dbf4 N terminus. Here we found that both Rad53 FHA1 and FHA2 domains, which specifically recognize phospho-threonine (pT), interacted with Dbf4 through an N-terminal sequence and an adjacent BRCT domain. Purified Rad53 FHA1 domain (but not FHA2) bound to a pT Dbf4 peptide in vitro, suggesting a possible phospho-threonine-dependent interaction between FHA1 and Dbf4. The Dbf4–Rad53 interaction is governed by multiple contacts that are separable from the Cdc5- and Msa1-binding sites in the Dbf4 N terminus. Importantly, abrogation of the Rad53–Dbf4 physical interaction blocked Dbf4 phosphorylation and allowed late-origin firing during replication checkpoint activation. This indicated that Rad53 must stably bind to Dbf4 to regulate its activity.

15.
Kinetochores are conserved protein complexes that bind the replicated chromosomes to the mitotic spindle and then direct their segregation. To better comprehend Saccharomyces cerevisiae kinetochore function, we dissected the phospho-regulated dynamic interaction between conserved kinetochore protein Cnn1 (CENP-T), the centromere region, and the Ndc80 complex through the cell cycle. Cnn1 localizes to kinetochores at basal levels from G1 through metaphase but accumulates abruptly at anaphase onset. How Cnn1 is recruited and which activities regulate its dynamic localization are unclear. We show that Cnn1 harbors two kinetochore-localization activities: a C-terminal histone-fold domain (HFD) that associates with the centromere region and an N-terminal Spc24/Spc25 interaction sequence that mediates linkage to the microtubule-binding Ndc80 complex. We demonstrate that the established Ndc80 binding site in the N terminus of Cnn1, Cnn1(60–84), should be extended with flanking residues, Cnn1(25–91), to allow near maximal binding affinity to Ndc80. Cnn1 localization was proposed to depend on Mps1 kinase activity at Cnn1–S74, based on in vitro experiments demonstrating the Cnn1–Ndc80 complex interaction. We demonstrate that from G1 through metaphase, Cnn1 localizes via both its HFD and N-terminal Spc24/Spc25 interaction sequence, and deletion or mutation of either region results in anomalous Cnn1 kinetochore levels. At anaphase onset (when Mps1 activity decreases) Cnn1 becomes enriched mainly via the N-terminal Spc24/Spc25 interaction sequence. In sum, we provide the first in vivo evidence of Cnn1 preanaphase linkages with the kinetochore and enrichment of the linkages during anaphase.

16.
The yeast Dbf4-dependent kinase (DDK) (composed of Dbf4 and Cdc7 subunits) is an essential, conserved Ser/Thr protein kinase that regulates multiple processes in the cell, including DNA replication, recombination and induced mutagenesis. Only DDK substrates important for replication and recombination have been identified. Consequently, the mechanism by which DDK regulates mutagenesis is unknown. The yeast mcm5-bob1 mutation that bypasses DDK’s essential role in DNA replication was used here to examine whether loss of DDK affects spontaneous as well as induced mutagenesis. Using the sensitive lys2ΔA746 frameshift reversion assay, we show DDK is required to generate “complex” spontaneous mutations, which are a hallmark of the Polζ translesion synthesis DNA polymerase. DDK co-immunoprecipitated with the Rev7 regulatory subunit, but not with the Rev3 polymerase subunit, of Polζ. Conversely, Rev7 bound mainly to the Cdc7 kinase subunit and not to Dbf4. The Rev7 subunit of Polζ may be regulated by DDK phosphorylation as immunoprecipitates of yeast Cdc7 and also recombinant Xenopus DDK phosphorylated GST-Rev7 in vitro. In addition to promoting Polζ-dependent mutagenesis, DDK was also important for generating Polζ-independent large deletions that revert the lys2ΔA746 allele. The decrease in large deletions observed in the absence of DDK likely results from an increase in the rate of replication fork restart after an encounter with spontaneous DNA damage. Finally, nonepistatic, additive/synergistic UV sensitivity was observed in cdc7Δ pol32Δ and cdc7Δ pol30-K127R,K164R double mutants, suggesting that DDK may regulate Rev7 protein during postreplication “gap filling” rather than during “polymerase switching” by ubiquitinated and sumoylated modified Pol30 (PCNA) and Pol32.

17.
18.
Asymmetric cell divisions produce daughter cells with distinct sizes and fates, a process important for generating cell diversity during development. Many Caenorhabditis elegans neuroblasts, including the posterior daughter of the Q cell (Q.p), divide to produce a larger neuron or neuronal precursor and a smaller cell that dies. These size and fate asymmetries require the gene pig-1, which encodes a protein orthologous to vertebrate MELK and belongs to the AMPK-related family of kinases. Members of this family can be phosphorylated and activated by the tumor suppressor kinase LKB1, a conserved polarity regulator of epithelial cells and neurons. In this study, we present evidence that the C. elegans orthologs of LKB1 (PAR-4) and its partners STRAD (STRD-1) and MO25 (MOP-25.2) regulate the asymmetry of the Q.p neuroblast division. We show that PAR-4 and STRD-1 act in the Q lineage and function genetically in the same pathway as PIG-1. A conserved threonine residue (T169) in the PIG-1 activation loop is essential for PIG-1 activity, consistent with the model that PAR-4 (or another PAR-4-regulated kinase) phosphorylates and activates PIG-1. We also demonstrate that PIG-1 localizes to centrosomes during cell divisions of the Q lineage, but this localization does not depend on T169 or PAR-4. We propose that a PAR-4-STRD-1 complex stimulates PIG-1 kinase activity to promote asymmetric neuroblast divisions and the generation of daughter cells with distinct fates. Changes in cell fate may underlie many of the abnormal behaviors exhibited by cells after loss of PAR-4 or LKB1.

19.
20.