Similar literature
1.

Background

We study the adaptation of Link Grammar Parser to the biomedical sublanguage with a focus on domain terms not found in a general parser lexicon. Using two biomedical corpora, we implement and evaluate three approaches to addressing unknown words: automatic lexicon expansion, the use of morphological clues, and disambiguation using a part-of-speech tagger. We evaluate each approach separately for its effect on parsing performance and consider combinations of these approaches.

Results

In addition to a 45% increase in parsing efficiency, we find that the best approach, incorporating information from a domain part-of-speech tagger, offers a statistically significant 10% relative decrease in error.

Conclusion

When available, a high-quality domain part-of-speech tagger is the best solution to unknown word issues in the domain adaptation of a general parser. In the absence of such a resource, surface clues can provide remarkably good coverage and performance when tuned to the domain. The adapted parser is available under an open-source license.

3.
Natural language processing is a fast and automatized process. A crucial part of this process is parsing, the online incremental construction of a syntactic structure. The aim of this study was to test whether a wh-filler extracted from an embedded clause is initially attached as the object of the matrix verb with subsequent reanalysis, and if so, whether the plausibility of such an attachment has an effect on reaction time. Finally, we wanted to examine whether subcategorization plays a role. We used a method called G-Maze to measure response time in a self-paced reading design. The experiments confirmed that there is early attachment of fillers to the matrix verb. When this attachment is implausible, the offline acceptability of the whole sentence is significantly reduced. The online results showed that G-Maze was highly suited for this type of experiment. In accordance with our predictions, the results suggest that the parser ignores (or has no access to information about) implausibility and attaches fillers as soon as possible to the matrix verb. However, the results also show that the parser uses the subcategorization frame of the matrix verb. In short, the parser ignores semantic information and allows implausible attachments but adheres to information about which type of object a verb can take, ensuring that the parser does not make impossible attachments. We argue that the evidence supports a syntactic parser informed by syntactic cues, rather than one guided by semantic cues or one that is blind, or completely autonomous.

4.
5.
A natural language parser implemented entirely in simulated neurons is described. It produces a semantic representation based on frames. It parses solely using simulated fatiguing Leaky Integrate and Fire neurons, a relatively accurate biological model that can be simulated efficiently. The model works on discrete cycles that each simulate 10 ms of biological time, so the parser has a simple mapping to psychological parsing time. Comparisons with human parsing studies show that the parser closely approximates these data. The parser makes use of Cell Assemblies, and the semantics of lexical items are represented by overlapping hierarchical Cell Assemblies so that semantically related items share neurons. This semantic encoding is used to resolve prepositional phrase attachment ambiguities encountered during parsing. Consequently, the parser provides a neurally-based cognitive model of parsing.
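To make the neuron model in this abstract more concrete, here is a minimal sketch of a fatiguing leaky integrate-and-fire unit stepped in discrete 10 ms cycles. Only the 10 ms cycle length comes from the abstract. The class name, the update rule, and every parameter value (threshold, decay, fatigue step, recovery rate) are illustrative assumptions and are not taken from the described parser.

```python
from dataclasses import dataclass

CYCLE_MS = 10  # one discrete cycle simulates 10 ms, as stated in the abstract


@dataclass
class FatiguingLIFNeuron:
    """Hypothetical fatiguing leaky integrate-and-fire unit (all values assumed)."""
    threshold: float = 1.0         # base firing threshold
    decay: float = 0.5             # fraction of activation kept per cycle (the leak)
    fatigue_step: float = 0.2      # fatigue added each time the neuron fires
    fatigue_recovery: float = 0.1  # fatigue removed each cycle it stays silent
    activation: float = 0.0
    fatigue: float = 0.0

    def step(self, input_current: float) -> bool:
        """Advance one cycle and report whether the neuron fired."""
        # Integrate: leak part of the old activation, then add the new input.
        self.activation = self.activation * self.decay + input_current
        # Fatigue raises the effective threshold.
        fired = self.activation >= self.threshold + self.fatigue
        if fired:
            self.activation = 0.0  # reset after a spike
            self.fatigue += self.fatigue_step
        else:
            self.fatigue = max(0.0, self.fatigue - self.fatigue_recovery)
        return fired


def simulate(neuron: FatiguingLIFNeuron, inputs: list[float]) -> list[int]:
    """Return spike times in simulated milliseconds."""
    return [cycle * CYCLE_MS for cycle, x in enumerate(inputs) if neuron.step(x)]


if __name__ == "__main__":
    print(simulate(FatiguingLIFNeuron(), [1.0] * 10))
    print(simulate(FatiguingLIFNeuron(fatigue_step=0.0), [1.0] * 10))
```

Under these assumed parameters, a constant input of 1.0 makes the unit fire on every cycle when fatigue is switched off and only on every other cycle when it is on. That is the qualitative effect fatigue is meant to have.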

6.

Background  

Automatic semantic role labeling (SRL) is a natural language processing (NLP) technique that maps sentences to semantic representations. This technique has been widely studied in recent years, but mostly with data in newswire domains. Here, we report on an SRL model for identifying the semantic roles of biomedical predicates describing protein transport in GeneRIFs – manually curated sentences focusing on gene functions. To avoid the computational cost of syntactic parsing, and because the boundaries of our protein transport roles often did not match up with syntactic phrase boundaries, we approached this problem with a word-chunking paradigm and trained support vector machine classifiers to classify words as being at the beginning, inside or outside of a protein transport role.
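The word-chunking formulation in this abstract can be illustrated by the label encoding it implies: each word is tagged as beginning (B), inside (I) or outside (O) of a role. The sketch below covers only that encoding step. The function name, the role names `CARGO` and `DESTINATION`, and the example sentence are hypothetical, and classifier training is omitted.

```python
def spans_to_bio(tokens: list[str], role_spans: list[tuple[int, int, str]]) -> list[str]:
    """Convert role spans to per-word B/I/O labels.

    Each span is (start, end, role) with `end` exclusive; spans are assumed
    not to overlap.
    """
    labels = ["O"] * len(tokens)  # every word starts outside any role
    for start, end, role in role_spans:
        if not 0 <= start < end <= len(tokens):
            raise ValueError(f"span ({start}, {end}) is outside the sentence")
        labels[start] = f"B-{role}"      # first word of the role
        for i in range(start + 1, end):  # remaining words of the role
            labels[i] = f"I-{role}"
    return labels


if __name__ == "__main__":
    # Hypothetical sentence and role names, for illustration only.
    tokens = ["ProteinX", "moves", "to", "the", "nucleus"]
    spans = [(0, 1, "CARGO"), (3, 5, "DESTINATION")]
    for token, label in zip(tokens, spans_to_bio(tokens, spans)):
        print(f"{token}\t{label}")
```

A per-word classifier would then be trained to predict these labels from word-level features, so no syntactic parse is needed.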

7.
8.
Atherosclerosis is an immunoinflammatory process that involves complex interactions between the vessel wall and blood components and is thought to be initiated by endothelial dysfunction [13]. Extracellular nucleotides that are released from a variety of arterial and blood cells [4] can bind to P2 receptors and modulate proliferation and migration of smooth muscle cells (SMC), which are known to be involved in the intimal hyperplasia that accompanies atherosclerosis and postangioplasty restenosis [5]. In addition, P2 receptors mediate many other functions, including platelet aggregation, leukocyte adherence, and arterial vasomotoricity. A direct pathological role of P2 receptors is reinforced by recent evidence showing that up-regulation and activation of P2Y2 receptors in rabbit arteries mediates intimal hyperplasia [6]. In addition, up-regulation of functional P2Y receptors has also been demonstrated in the basilar artery of the rat double-hemorrhage model [7] and in coronary arteries of diabetic dyslipidemic pigs [8]. It has been proposed that up-regulation of P2Y receptors may be a potential diagnostic indicator for the early stages of atherosclerosis [9]. Therefore, particular effort must be made to understand the consequences of nucleotide release from cells in the cardiovascular system and the subsequent effects of P2 nucleotide receptor activation in blood vessels, which may reveal novel therapeutic strategies for atherosclerosis and restenosis after angioplasty.

9.
MOTIVATION: The field of 'DNA linguistics' has emerged from pioneering work in computational linguistics and molecular biology. Most formal grammars in this field are expressed using Definite Clause Grammars, but these have computational limitations which must be overcome. The present study provides a new DNA parsing system, comprising a logic grammar formalism called Basic Gene Grammars (BGGs) and a bidirectional chart parser, DNA-ChartParser. RESULTS: The use of Basic Gene Grammars is demonstrated in representing many formulations of the knowledge of Escherichia coli promoters, including knowledge acquired from human experts, consensus sequences, statistics (weight matrices), symbolic learning, and neural network learning. The DNA-ChartParser provides bidirectional parsing facilities for BGGs in handling overlapping categories, gap categories, approximate pattern matching, and constraints. Basic Gene Grammars and the DNA-ChartParser allowed different sources of knowledge for recognizing E. coli promoters to be combined to achieve better accuracy as assessed by parsing these DNA sequences in real-world data sets.
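One of the facilities listed above, approximate pattern matching against a consensus sequence, can be sketched independently of any grammar formalism. The example below is not the DNA-ChartParser algorithm. It is a plain sliding-window Hamming-distance scan, and the function name, the mismatch budget, and the test sequence are assumptions made for illustration. `TATAAT` is the commonly cited consensus of the E. coli −10 promoter element.

```python
def find_approximate(sequence: str, pattern: str, max_mismatches: int) -> list[tuple[int, int]]:
    """Return (position, mismatches) for every window within the mismatch budget."""
    hits = []
    width = len(pattern)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Hamming distance: count positions where window and pattern differ.
        mismatches = sum(1 for a, b in zip(window, pattern) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence; TATACT differs from the consensus at one position.
    print(find_approximate("GGTATACTGG", "TATAAT", max_mismatches=1))
```

A grammar-based system would treat such a match as one category among several and combine it with other evidence, for example weight-matrix scores, instead of relying on it alone.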

10.
When we read or listen to language, we are faced with the challenge of inferring intended messages from noisy input. This challenge is exacerbated by considerable variability between and within speakers. Focusing on syntactic processing (parsing), we test the hypothesis that language comprehenders rapidly adapt to the syntactic statistics of novel linguistic environments (e.g., speakers or genres). Two self-paced reading experiments investigate changes in readers’ syntactic expectations based on repeated exposure to sentences with temporary syntactic ambiguities (so-called “garden path sentences”). These sentences typically lead to a clear expectation violation signature when the temporary ambiguity is resolved to an a priori less expected structure (e.g., based on the statistics of the lexical context). We find that comprehenders rapidly adapt their syntactic expectations to converge towards the local statistics of novel environments. Specifically, repeated exposure to a priori unexpected structures can reduce, and even completely undo, their processing disadvantage (Experiment 1). The opposite is also observed: a priori expected structures become less expected (even eliciting garden paths) in environments where they are hardly ever observed (Experiment 2). Our findings suggest that, when changes in syntactic statistics are to be expected (e.g., when entering a novel environment), comprehenders can rapidly adapt their expectations, thereby overcoming the processing disadvantage that mistaken expectations would otherwise cause. Our findings take a step towards unifying insights from research in expectation-based models of language processing, syntactic priming, and statistical learning.

11.

Background  

The lack of detailed understanding of the mechanism of action of many biowarfare agents poses an immediate challenge to biodefense efforts. Many potential bioweapons have been shown to affect the cellular pathways controlling apoptosis [14]. For example, pathogen-produced exotoxins such as Staphylococcal Enterotoxin B (SEB) and Anthrax Lethal Factor (LF) have been shown to disrupt the Fas-mediated apoptotic pathway [2, 4]. To evaluate how these agents affect these pathways, it is first necessary to understand the dynamics of a normally functioning apoptosis network. This can then serve as a baseline against which a pathogen-perturbed system can be compared. Such comparisons can expose both the proteins most susceptible to alteration by the agent and the most critical reaction rates, allowing better control of a biological network.

12.
Gap-junctional coupling is an important way of communication between neurons and other excitable cells. Strong electrical coupling synchronizes activity across cell ensembles. Surprisingly, in the presence of noise synchronous oscillations generated by an electrically coupled network may differ qualitatively from the oscillations produced by uncoupled individual cells forming the network. A prominent example of such behavior is the synchronized bursting in islets of Langerhans formed by pancreatic β-cells, which in isolation are known to exhibit irregular spiking (Sherman and Rinzel, Biophys J 54:411–425, 1988; Sherman and Rinzel, Biophys J 59:547–559, 1991). At the heart of this intriguing phenomenon lies denoising, a remarkable ability of electrical coupling to diminish the effects of noise acting on individual cells. In this paper, building on an earlier analysis of denoising in networks of integrate-and-fire neurons (Medvedev, Neural Comput 21 (11):3057–3078, 2009) and our recent study of spontaneous activity in a closely related model of the Locus Coeruleus network (Medvedev and Zhuravytska, The geometry of spontaneous spiking in neuronal networks, submitted, 2012), we derive quantitative estimates characterizing denoising in electrically coupled networks of conductance-based models of square wave bursting cells. Our analysis reveals the interplay of the intrinsic properties of the individual cells and network topology and their respective contributions to this important effect. In particular, we show that networks on graphs with large algebraic connectivity (Fiedler, Czech Math J 23(98):298–305, 1973) or small total effective resistance (Bollobas, Modern graph theory, Graduate Texts in Mathematics, vol. 184, Springer, New York, 1998) are better equipped for implementing denoising. 
As a by-product of the analysis of denoising, we analytically estimate the rate with which trajectories converge to the synchronization subspace and the stability of the latter to random perturbations. These estimates reveal the role of the network topology in synchronization. The analysis is complemented by numerical simulations of electrically coupled conductance-based networks. Taken together, these results explain the mechanisms underlying synchronization and denoising in an important class of biological models.

13.
The Protein Kinase C (PKC) family of enzymes is a group of serine/threonine kinases that play central roles in cell-cycle regulation, development and cancer. A key step in the activation of PKC is translocation to membranes and binding of membrane-associated activators including diacylglycerol (DAG). Interaction of novel and conventional isotypes of PKC with DAG and phorbol esters occurs through the two C1 regulatory domains (C1A and C1B), which exhibit distinct ligand binding selectivity that likely controls enzyme activation by different co-activators. PKC has also been implicated in physiological responses to alcohol consumption and it has been proposed that PKCα (Slater et al. J Biol Chem 272(10):6167–6173, 1997; Slater et al. Biochemistry 43(23):7601–7609, 2004), PKCε (Das et al. Biochem J 421(3):405–413, 2009) and PKCδ (Das et al. J Biol Chem 279(36):37964–37972, 2004; Das et al. Protein Sci 15(9):2107–2119, 2006) contain specific alcohol-binding sites in their C1 domains. We are interested in understanding how ethanol affects signal transduction processes through its effects on the structure and function of the C1 domains of PKC. Here we present the 1H, 15N and 13C NMR chemical shift assignments for the Rattus norvegicus PKCδ C1A and C1B proteins.

14.
Timely release of dopamine (DA) at the striatum seems to be important for reinforcement learning (RL) mediated by the basal ganglia. Houk et al. (in: Houk et al. (eds) Models of information processing in the basal ganglia, 1995) proposed a cellular signaling pathway model to characterize the interaction between the DA and glutamate pathways that have a role in RL. The model simulation results, obtained with the GENESIS KINETIKIT simulator, indicate that there is not only a prolongation of duration, as proposed by Houk et al. (1995), but also an enhancement in the amplitude of autophosphorylation of CaMKII. Further, the autophosphorylated form of CaMKII may form a basis for the “eligibility trace” condition required in RL. This simulation study is the first of its kind to support the comprehensive theoretical proposal of Houk et al. (1995).

15.
Gene Structure Prediction by Linguistic Methods
The higher-order structure of genes and other features of biological sequences can be described by means of formal grammars. These grammars can then be used by general-purpose parsers to detect and to assemble such structures by means of syntactic pattern recognition. We describe a grammar and parser for eukaryotic protein-encoding genes, which by some measures is as effective as current connectionist and combinatorial algorithms in predicting gene structures for sequence database entries. Parameters of the grammar rules are optimized for several different species, and mixing experiments are performed to determine the degree of species specificity and the relative importance of compositional, signal-based, and syntactic components in gene prediction.

16.
17.
Intense nanosecond pulsed electric fields (nsPEFs) have been shown to induce, on intracellular structures, interesting effects dependent on electrical exposure conditions (pulse length and amplitude, repetition frequency and number of pulses), which are known in the literature as “bioelectrical effects” (Schoenbach et al., IEEE Trans Plasma Sci 30:293–300, 2002). In particular, pulses with a shorter width than the plasma membrane charging time constant (about 100 ns for mammalian cells) can penetrate the cell and trigger effects such as permeabilization of intracellular membranes, release of Ca2+ and apoptosis induction. Moreover, the observed effects have led to exploration of medical applications, like the treatment of melanoma tumors (Nuccitelli et al., Biochem Biophys Res Commun 343:351–360, 2006). Pulsed electric fields allowing such effects usually range from several tens to a few hundred nanoseconds in duration and from a few to several tens of megavolts per meter in amplitude (Schoenbach et al., IEEE Trans Diel Elec Insul 14:1088–1109, 2007); however, the biological effects of subnanosecond pulses have also been investigated (Schoenbach et al., IEEE Trans Plasma Sci 36:414–422, 2008). The use of such a large variety of pulse parameters suggests that highly flexible pulse-generating systems, able to deliver wide ranges of pulse durations and amplitudes, are required in order to explore effects and applications related to different exposure conditions. The Blumlein pulse-forming network is an often-employed circuit topology for the generation of high-voltage electric pulses with fixed pulse duration. An innovative modification to the Blumlein circuit has recently been devised which allows generation of pulses with variable amplitude, duration and polarity.
Two different modified Blumlein pulse-generating systems are presented in this article, the first based on a coaxial cable configuration, matching microscopic slides as a pulse-delivery system, and the other based on microstrip transmission lines and designed to match cuvettes for the exposure of cell suspensions.

18.
Recently, Haas et al. (J Neurophysiol 96: 3305–3313, 2006) observed a novel form of spike timing dependent plasticity (iSTDP) in GABAergic synaptic couplings in layer II of the entorhinal cortex. Depending on the relative timings of the presynaptic input at time t_pre and the postsynaptic excitation at time t_post, the synapse is strengthened (Δt = t_post − t_pre > 0) or weakened (Δt < 0). The temporal dynamic range of the observed STDP rule was found to lie in the higher gamma frequency band (≥40 Hz), a frequency range important for several vital neuronal tasks. In this paper we study the function of this novel form of iSTDP in the synchronization of the inhibitory neuronal network. In particular, we consider a network of two unidirectionally coupled interneurons (UCI) and two mutually coupled interneurons (MCI), in the presence of heterogeneity in the intrinsic firing rates of each coupled neuron. Using the method of spike time response curve (STRC), we show how iSTDP influences the dynamics of the coupled neurons, such that the pair synchronizes under moderately large heterogeneity in the firing rates. Using the general properties of the STRC for a Type-1 neuron model (Ermentrout, Neural Comput 8:979–1001, 1996) and the observed iSTDP, we determine conditions on the initial configuration of the UCI network that would result in 1:1 in-phase synchrony between the two coupled neurons. We then demonstrate a similar enhancement of synchrony in the MCI with dynamic synaptic modulation. For the MCI we also consider heterogeneity introduced in the network through the synaptic parameters: the synaptic decay time of mutual inhibition and the self-inhibition synaptic strength. We show that the MCI exhibits enhanced synchrony in the presence of all the above-mentioned sources of heterogeneity and that the mechanism for this enhanced synchrony is similar to the case of the UCI.
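The sign convention of the plasticity rule described above (strengthening when Δt = t_post − t_pre > 0, weakening when Δt < 0) can be written as a small function. Only that sign convention comes from the abstract. The exponential shape of the window, the amplitude, and the time constant are placeholder assumptions and do not represent the curve measured by Haas et al.

```python
import math


def istdp_weight_change(t_pre_ms: float, t_post_ms: float,
                        amplitude: float = 0.01, tau_ms: float = 10.0) -> float:
    """Signed change in inhibitory synaptic strength for one spike pair.

    The sign follows the abstract: positive for dt > 0, negative for dt < 0.
    The exponential window and its parameters are hypothetical.
    """
    dt = t_post_ms - t_pre_ms
    if dt == 0:
        return 0.0  # the abstract does not specify this case; assume no change
    magnitude = amplitude * math.exp(-abs(dt) / tau_ms)
    return magnitude if dt > 0 else -magnitude


if __name__ == "__main__":
    print(istdp_weight_change(t_pre_ms=0.0, t_post_ms=5.0))  # pre before post
    print(istdp_weight_change(t_pre_ms=5.0, t_post_ms=0.0))  # post before pre
```

With any window of this kind, spike pairs that are close in time change the synapse more than pairs that are far apart, so the synaptic strength keeps adjusting as the relative spike timing of the two neurons shifts.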

19.
Several pilot experiments have indicated that improvements in older NMR structures can be expected by applying modern software and new protocols (Nabuurs et al. in Proteins 55:483–186, 2004; Nederveen et al. in Proteins 59:662–672, 2005; Saccenti and Rosato in J Biomol NMR 40:251–261, 2008). A recent large-scale X-ray study has also shown that modern software can significantly improve the quality of X-ray structures that were deposited more than a few years ago (Joosten et al. in J. Appl Crystallogr 42:376–384, 2009; Sanderson in Nature 459:1038–1039, 2009). Recalculation of three-dimensional coordinates requires that the original experimental data are available and complete, and are semantically and syntactically correct, or are at least correct enough to be reconstructed. For multiple reasons, including a lack of standards, the heterogeneity of the experimental data and the many NMR experiment types, it has not been practical to parse a large proportion of the originally deposited NMR experimental data files related to protein NMR structures. This has made the automatic recalculation, and thus improvement, of the three-dimensional coordinates of these structures impractical. We here describe a large-scale international collaborative effort to make all deposited experimental NMR data semantically and syntactically homogeneous, and thus useful for further research. A total of 4,014 out of 5,266 entries were ‘cleaned’ in this process. For 1,387 entries, human intervention was needed. Continuous efforts in automating the parsing of both old and newly deposited files are steadily decreasing this fraction. The cleaned data files are available from the NMR restraints grid at .

20.
A0, a Cu(II) thioxotriazole complex, produces severe cytotoxic effects on HT1080 human fibrosarcoma cells with a potency comparable to that exhibited by cisplatin. A0 induced a characteristic series of changes, hallmarked by the formation of eosin- and Sudan Black-B-negative vacuoles. No evidence of nuclear fragmentation or caspase-3 activation was detected in cells treated with A0 which, rather, inhibited cisplatin-stimulated caspase-3 activity. Membrane functional integrity, assessed with calcein and propidium iodide, was spared until the late stages of the death process induced by the copper complex. Vacuoles were negative for the autophagy marker monodansylcadaverine and their formation was not blocked by 3-methyladenine, an inhibitor of autophagic processes. Negativity for the extracellular marker pyranine excluded vacuole derivation from the extracellular fluid. Ultrastructural analysis indicated that A0 caused the appearance of many electron-light cytoplasmic vesicles, possibly related to the endoplasmic reticulum, which progressively enlarge and coalesce to form large vacuolar structures that eventually fill the cytoplasm. It is concluded that A0 triggers a non-apoptotic, type 3B programmed cell death (Clarke in Anat Embryol (Berl) 181:195–213, 1990), characterized by extensive cytoplasmic vacuolization. This peculiar cytotoxicity pattern may make A0 of particular interest in apoptosis-resistant cell models.
