Similar literature
 Found 20 similar documents; search time: 15 ms
1.
There is a trend towards automatic analysis of large amounts of literature in the biomedical domain. However, this can be effective only if the ambiguity in natural language is resolved. In this paper, the current state of research in word sense disambiguation (WSD) is reviewed. Several methods for WSD have already been proposed, but many systems have been tested only on evaluation sets of limited size. There are currently only very few applications of WSD in the biomedical domain. The current direction of research points towards statistically based algorithms that use existing curated data and can be applied to large sets of biomedical literature. There is a need for manually tagged evaluation sets to test WSD algorithms in the biomedical domain. WSD algorithms should preferably be able to take into account both known and unknown senses of a word. Without WSD, automatic meta-analysis of large corpora of text will be error-prone.

2.
Word Sense Disambiguation (WSD) is the task of determining which sense of an ambiguous word (a word with multiple meanings) is intended in a particular use of that word, by considering its context. A sentence is considered ambiguous if it contains one or more ambiguous words. In practice, a sentence classified as ambiguous usually has multiple interpretations, but only one of them is correct. We propose an unsupervised method that exploits knowledge-based approaches for word sense disambiguation using the Harmony Search Algorithm (HSA) based on a Stanford dependencies generator (HSDG). The role of the dependency generator is to parse sentences to obtain their dependency relations. The goal of using the HSA, in turn, is to maximize the overall semantic similarity of the set of parsed words. HSA invokes a combination of semantic similarity and relatedness measurements, i.e., Jiang and Conrath (jcn) and an adapted Lesk algorithm, to compute the HSA fitness function. Our proposed method was evaluated on benchmark datasets and yielded results comparable to state-of-the-art WSD methods. To evaluate the effectiveness of the dependency generator, we apply the same methodology without the parser, using a window of words instead. The empirical results demonstrate that the proposed method is able to produce effective solutions for most instances of the datasets used.
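To make the idea of searching for the sense assignment with the highest overall similarity concrete, here is a minimal sketch of a harmony search over candidate senses. It is not the authors' implementation: the toy sense inventory `SENSES`, the gloss-overlap `similarity` function (a crude stand-in for the jcn and adapted Lesk measures), and the parameter values `hmcr` and `par` are all assumptions made for illustration, and no dependency parser is involved.

```python
import random

# Hypothetical toy sense inventory: word -> {sense id: set of gloss words}.
SENSES = {
    "bank": {
        "bank#finance": {"money", "deposit", "loan", "institution"},
        "bank#river": {"river", "water", "slope", "land"},
    },
    "deposit": {
        "deposit#money": {"money", "account", "institution", "payment"},
        "deposit#sediment": {"sediment", "water", "layer", "land"},
    },
    "loan": {
        "loan#money": {"money", "institution", "payment", "interest"},
    },
}


def similarity(sense_a, sense_b, glosses):
    """Lesk-style stand-in: number of gloss words the two senses share."""
    return len(glosses[sense_a] & glosses[sense_b])


def fitness(assignment, glosses):
    """Sum of pairwise similarities over the senses chosen for all words."""
    total = 0
    for i in range(len(assignment)):
        for j in range(i + 1, len(assignment)):
            total += similarity(assignment[i], assignment[j], glosses)
    return total


def harmony_search(words, senses, memory_size=5, iterations=200,
                   hmcr=0.9, par=0.3, seed=0):
    """Return the sense assignment with the best fitness found."""
    rng = random.Random(seed)
    glosses = {s: g for w in words for s, g in senses[w].items()}
    options = [sorted(senses[w]) for w in words]
    # Harmony memory: a small pool of random candidate assignments.
    memory = [[rng.choice(opts) for opts in options] for _ in range(memory_size)]
    for _ in range(iterations):
        new = []
        for k, opts in enumerate(options):
            if rng.random() < hmcr:
                # Memory consideration: reuse a sense from a stored harmony.
                choice = rng.choice(memory)[k]
                if rng.random() < par:
                    # Pitch adjustment: step to a neighbouring sense.
                    step = rng.choice((-1, 1))
                    choice = opts[(opts.index(choice) + step) % len(opts)]
            else:
                # Random selection from all senses of the word.
                choice = rng.choice(opts)
            new.append(choice)
        # Replace the worst stored harmony if the new one is strictly better.
        worst = min(memory, key=lambda h: fitness(h, glosses))
        if fitness(new, glosses) > fitness(worst, glosses):
            memory[memory.index(worst)] = new
    return max(memory, key=lambda h: fitness(h, glosses))


best = harmony_search(["bank", "deposit", "loan"], SENSES)
```

On this toy inventory the financial reading of all three words shares the most gloss words, so it has the highest fitness; a real system would replace `similarity` with WordNet-based measures and restrict the word set using the dependency relations.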

3.

Background  

Ontology term labels can be ambiguous and have multiple senses. While this is no problem for human annotators, it is a challenge for automated methods that identify ontology terms in text. Classical approaches to word sense disambiguation use co-occurring words or terms. However, most treat ontologies as simple terminologies, without making use of the ontology structure or the semantic similarity between terms. Metadata are another useful source of information for disambiguation. Here, we systematically compare three approaches to word sense disambiguation, which use ontologies and metadata, respectively.

4.

Background  

Word sense disambiguation (WSD) is critical in the biomedical domain for improving the precision of natural language processing (NLP), text mining, and information retrieval systems because ambiguous words negatively impact accurate access to literature containing biomolecular entities, such as genes, proteins, cells, diseases, and other important entities. Automated techniques have been developed that address the WSD problem for a number of text processing situations, but the problem is still a challenging one. Supervised WSD machine learning (ML) methods have been applied in the biomedical domain and have shown promising results, but the results typically incorporate a number of confounding factors, and it is problematic to truly understand the effectiveness and generalizability of the methods because these factors interact with each other and affect the final results. Thus, there is a need to explicitly address the factors and to systematically quantify their effects on performance.

5.

Background  

Word sense disambiguation (WSD) algorithms attempt to select the proper sense of ambiguous terms in text. Resources like the UMLS provide a reference thesaurus that can be used to annotate the biomedical literature. Statistical learning approaches have produced good results, but the size of the UMLS makes it infeasible to produce training data that cover the whole domain.

6.

Background

In biomedical research, events revealing complex relations between entities play an important role. Biomedical event trigger identification has become a research hotspot because of its important role in biomedical event extraction. Traditional machine learning methods, such as support vector machines (SVM) and maxent classifiers, which rely on manually designed features fed to the classifiers, depend on an understanding of the specific task and cannot generalize to new domains or new examples.

Methods

In this paper, we propose an approach that uses a neural network model based on dependency-based word embeddings to automatically learn significant features from raw input for trigger classification. First, we employ Word2vecf, a modified version of Word2vec, to learn word embeddings with rich semantic and functional information based on the dependency relation tree. Then a neural network architecture is used to learn a more significant feature representation from the raw dependency-based word embeddings. Meanwhile, we dynamically adjust the embeddings during training to adapt them to the trigger classification task. Finally, a softmax classifier labels each example with a specific trigger class using the features learned by the model.
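The final classification step can be illustrated with a minimal sketch of a softmax layer applied to a learned feature vector. The feature values, weights, biases, and trigger class names below are hypothetical placeholders, not values or labels from the paper, and the sketch omits the embedding and hidden layers entirely.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases, labels):
    """Score each class as w . x + b, then pick the most probable label."""
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs


# Hypothetical two-dimensional feature vector and three trigger classes.
labels = ["Regulation", "Binding", "None"]
weights = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
biases = [0.0, 0.0, 0.0]
label, probs = classify([1.0, 0.0], weights, biases, labels)
```

In a trained model the weights and biases would be learned jointly with the embeddings rather than set by hand.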

Results

The experimental results show that our approach achieves a micro-averaged F1 score of 78.27% and a macro-averaged F1 score of 76.94% in significant trigger classes, and performs better than baseline methods. In addition, we can obtain a semantic distributed representation of every trigger word.
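Micro- and macro-averaged F1 differ in how they aggregate over classes: micro-averaging pools the true positives, false positives, and false negatives of all classes before computing F1, whereas macro-averaging computes F1 per class and takes the unweighted mean. The following sketch shows the difference on a made-up four-example labelling; the labels `A` and `B` are placeholders, not trigger classes from the paper.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, pred, classes):
    """Return (micro F1, macro F1) over the given classes."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            if g in classes:
                tp[g] += 1
        else:
            if p in classes:
                fp[p] += 1  # predicted p, but the gold label was different
            if g in classes:
                fn[g] += 1  # gold label g was missed
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in classes) / len(classes)
    micro = f1(
        sum(tp[c] for c in classes),
        sum(fp[c] for c in classes),
        sum(fn[c] for c in classes),
    )
    return micro, macro


gold = ["A", "A", "A", "B"]
pred = ["A", "A", "B", "B"]
micro, macro = micro_macro_f1(gold, pred, ["A", "B"])
# Class A: F1 = 0.8; class B: F1 = 2/3; micro = 0.75; macro = 11/15.
```

Because the macro average weights every class equally, it is pulled down more strongly by poorly recognised rare classes than the micro average is.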

7.
Background

We study the adaptation of Link Grammar Parser to the biomedical sublanguage with a focus on domain terms not found in a general parser lexicon. Using two biomedical corpora, we implement and evaluate three approaches to addressing unknown words: automatic lexicon expansion, the use of morphological clues, and disambiguation using a part-of-speech tagger. We evaluate each approach separately for its effect on parsing performance and consider combinations of these approaches.

Results

In addition to a 45% increase in parsing efficiency, we find that the best approach, incorporating information from a domain part-of-speech tagger, offers a statistically significant 10% relative decrease in error.

Conclusion

When available, a high-quality domain part-of-speech tagger is the best solution to unknown word issues in the domain adaptation of a general parser. In the absence of such a resource, surface clues can provide remarkably good coverage and performance when tuned to the domain. The adapted parser is available under an open-source license.


8.
9.
K Matsuno. Bio Systems, 1992, 27(4): 235-239
The natural language processor in the brain can cope with non-programmable computation. The average number of different lexical meanings per word serves as a quantitative measure of the extent to which the computation is non-programmable. The maximum average number of different lexical meanings per word that the brain of a subject reading a text can cope with while comprehending the context is found to be 3.3, with a standard deviation of 0.15; beyond this value the brain can no longer succeed in comprehending the context. In contrast, the maximum average number of different lexical meanings per word that would make lexical disambiguation programmable is e = 2.718. Natural language processing in the brain is non-programmable in the sense that the manageable average number of different meanings per word is greater than e, but does not exceed roughly 3.3.

10.

12.
Pathogenic fungi are a growing health concern worldwide, particularly in large, densely populated cities. The dramatic upsurge of pigeon populations in cities has been implicated in the increased incidence of invasive fungal infections. In this study, we used a culture‐independent, high‐throughput sequencing approach to describe the diversity of clinically relevant fungi (CRF) associated with pigeon faeces and map the relative abundance of CRF across Seoul, Korea. In addition, we tested whether certain geographical, sociological and meteorological factors were significantly associated with the diversity and relative abundance of CRF. Finally, we compared the CRF diversity of fresh and old pigeon faeces to identify the source of the fungi and the role of pigeons in dispersal. Our results demonstrated that both the composition and relative abundance of CRF are unevenly distributed across Seoul. The green area ratio and the number of multiplex houses were positively correlated with species diversity, whereas wind speed and number of households were negatively correlated. The number of workers and green area ratio were positively correlated with the relative abundance of CRF, whereas wind speed was negatively correlated. Because many CRF were absent in fresh faeces, we inferred that most species cannot survive the gastrointestinal tract of pigeons and instead are likely transmitted through soil or air and use pigeon faeces as a substrate for proliferation.

13.
Western-style diet (WSD), which is high in fat and low in fiber, lacks nutrients to support gut microbiota. Consequently, WSD reduces microbiota density and promotes microbiota encroachment, potentially influencing colonization resistance, immune system readiness, and thus host defense against pathogenic bacteria. Here we examined the impact of WSD on infection and colitis in response to Citrobacter rodentium. We observed that, relative to mice consuming standard rodent grain-based chow (GBC), feeding WSD starkly altered the dynamics of Citrobacter infection, reducing initial colonization and inflammation but frequently resulting in persistent infection that was associated with low-grade inflammation and insulin resistance. WSD’s reduction in initial Citrobacter virulence appeared to reflect that colons of GBC-fed mice contain microbiota metabolites, including short-chain fatty acids, especially acetate, that drive Citrobacter growth and virulence. Citrobacter persistence in WSD-fed mice reflected the inability of resident microbiota to out-compete it from the gut lumen, likely reflecting the profound impacts of WSD on microbiota composition. These studies demonstrate the potential of altering microbiota and their metabolites by diet to impact the course and consequence of infection following exposure to a gut pathogen.

14.
Values of the water saturation deficit (WSD) for hydroactive stomatal movements of kale leaves were estimated using the method of transpiration curve analysis. Stomata of young leaves started closing at WSD values of 5 to 6 per cent and were completely closed at 18 to 20 per cent WSD. During maturation and ageing of leaves these WSD values increased to 12.5 and 18 to 23 per cent respectively. Thus the stomatal reaction is more sensitive to changes in WSD in adult leaves than in young ones. After maturation is attained both values decrease. In apparently withering leaves the individual phases of transpiration curves can barely be distinguished, probably for the reason that even under optimal conditions their stomata remain half-closed and at high WSD values an incomplete closing of the aperture occurs. The injured cuticle of withering leaves affects the shape of the transpiration curve as well.

15.
Cognitive functions rely on the extensive use of information stored in the brain, and searching for the information relevant to solving a given problem is a very complex task. Human cognition largely uses biological search engines, and we assume that to study cognitive function we need to understand the way these brain search engines work. The approach we favor is to study multimodular network models that are able to solve particular problems involving a search for information. The building blocks of these multimodular networks are the context-dependent memory models we have been using for almost 20 years. These models work by associating an output with the Kronecker product of an input and a context. Input, context and output are vectors that represent cognitive variables. Our models constitute a natural extension of the traditional linear associator. We show that coding the information in vectors that are processed through association matrices allows for a direct contact between these memory models and some procedures that are now classical in the Information Retrieval field. One essential feature of context-dependent models is that they are based on the thematic packing of information, whereby each context points to a particular set of related concepts. The thematic packing can be extended to multimodular networks involving input-output contexts, in order to accomplish more complex tasks. Contexts act as passwords that elicit the appropriate memory to deal with a query. We also show toy versions of several ‘neuromimetic’ devices that solve cognitive tasks as diverse as decision making or word sense disambiguation. The functioning of these multimodular networks can be described as dynamical systems at the level of cognitive variables.
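The association of an output with the Kronecker product of an input and a context can be sketched in a few lines. The version below is a toy illustration under stated assumptions: the vectors are hand-picked orthonormal unit vectors, the memory matrix is built as a sum of outer products (the usual linear-associator construction), and the "bank" example and all function names are hypothetical rather than taken from the paper.

```python
def kron(u, v):
    """Kronecker product of two vectors, returned as a flat list."""
    return [a * b for a in u for b in v]


def train(triples, out_dim, key_dim):
    """Build M as the sum of outer products: output x kron(input, context)."""
    memory = [[0.0] * key_dim for _ in range(out_dim)]
    for inp, ctx, out in triples:
        key = kron(inp, ctx)
        for i, o in enumerate(out):
            for j, k in enumerate(key):
                memory[i][j] += o * k
    return memory


def recall(memory, inp, ctx):
    """Retrieve the output associated with an (input, context) pair."""
    key = kron(inp, ctx)
    return [sum(m * k for m, k in zip(row, key)) for row in memory]


# Hypothetical example: one ambiguous input word and two contexts.
bank = [1.0, 0.0]
finance_ctx, river_ctx = [1.0, 0.0], [0.0, 1.0]
money_sense, shore_sense = [1.0, 0.0], [0.0, 1.0]

memory = train(
    [(bank, finance_ctx, money_sense), (bank, river_ctx, shore_sense)],
    out_dim=2,
    key_dim=4,
)
in_finance = recall(memory, bank, finance_ctx)  # equals money_sense
in_river = recall(memory, bank, river_ctx)      # equals shore_sense
```

The same input vector retrieves different outputs depending on the context vector, which is the sense in which a context acts as a password; with non-orthogonal vectors the recalled output would be a mixture of the stored ones.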

16.
17.
18.
Waxes are components of the cuticle covering the aerial organs of plants. Accumulation of waxes has previously been associated with protection against water loss, therefore contributing to drought tolerance. However, little is known about the function of individual wax components during water deficit. We studied the role of wax ester synthesis during drought. The wax ester load on Arabidopsis leaves and stems was increased during water deficiency. Expression of three genes, WSD1, WSD6 and WSD7, of the wax ester synthase/diacylglycerol acyltransferase (WS/DGAT or WSD) family was induced during drought, salt stress and abscisic acid treatment. WSD1 has previously been identified as the major wax ester synthase of stems. wsd1 mutants showed reduced wax ester coverage on leaves and stems under normal or drought conditions, while wax ester loads of wsd6, wsd7 and of the wsd6wsd7 double mutant were unchanged. The growth and relative water content of wsd1 plants were compromised during drought, while leaf water loss of wsd1 was increased. Enzyme assays with recombinant proteins expressed in insect cells revealed that WSD6 and WSD7 have wax ester synthase activity, albeit with different substrate specificity compared with WSD1. WSD6 and WSD7 localize to the endoplasmic reticulum (ER)/Golgi. These results demonstrate that WSD1 is involved in the accumulation of wax esters during drought, while WSD6 and WSD7 might play other specific roles in wax ester metabolism during stress.

19.

Background

Corticotropin-releasing factor (CRF) is typically considered to mediate aversive aspects of stress, fear and anxiety. However, CRF release in the brain is also elicited by natural rewards and incentive cues, raising the possibility that some CRF systems in the brain mediate an independent function of positive incentive motivation, such as amplifying incentive salience. Here we asked whether activation of a limbic CRF subsystem magnifies the increase in positive motivation for reward elicited by incentive cues previously associated with that reward, in a way that might exacerbate cue-triggered binge pursuit of food or other incentives. We assessed the impact of CRF microinjections into the medial shell of nucleus accumbens using a pure incentive version of Pavlovian-Instrumental transfer, a measure specifically sensitive to the incentive salience of reward cues (which it separates from influences of aversive stress, stress reduction, frustration and other traditional explanations for stress-increased behavior). Rats were first trained to press one of two levers to obtain sucrose pellets, and then separately conditioned to associate a Pavlovian cue with free sucrose pellets. On test days, rats received microinjections of vehicle, CRF (250 or 500 ng/0.2 μl) or amphetamine (20 μg/0.2 μl). Lever pressing was assessed in the presence or absence of the Pavlovian cues during a half-hour test.

Results

Microinjections of the highest dose of CRF (500 ng) or amphetamine (20 μg) selectively enhanced the ability of Pavlovian reward cues to trigger phasic peaks of increased instrumental performance for a sucrose reward, each peak lasting a minute or so before decaying after the cue. Lever pressing was not enhanced by CRF microinjections in the baseline absence of the Pavlovian cue or during the presentation without a cue, showing that the CRF enhancement could not be explained as a result of generalized motor arousal, frustration or stress, or by persistent attempts to ameliorate aversive states.

Conclusion

We conclude that CRF in nucleus accumbens shell amplifies positive motivation for cued rewards, in particular by magnifying incentive salience that is attributed to Pavlovian cues previously associated with those rewards. CRF-induced magnification of incentive salience provides a novel explanation as to why stress may produce cue-triggered bursts of binge eating, drug addiction relapse, or other excessive pursuits of rewards.

20.
Improvements in forest fire risk estimation and mapping fire risk zones are vital to reduce the negative impacts of fire and to facilitate planning for the protection of forested areas. This is especially important for places with little previous data on fire history. This paper presents an improved conceptual scheme for the assessment and mapping of fire risk using a Forest Resource Inventory Database, based on four groups of factors: topography, human activity, climate, and forest characteristics. We selected 12 variables based on our defined conceptual scheme to generate a synthetic forest fire risk index (FRI) to quantify potential forest fire risk and map risk zones in the Wuyishan Scenery District (WSD), a world heritage site located in the northwest of Fujian province, People's Republic of China. Spatial statistics were used to examine the spatio-temporal variation of FRI. The results showed that the main fire risk zones in the WSD were in the low or moderate categories (accounting for 76.7% of the total area of the WSD in 1997 and 79.2% in 2009). The spatial heterogeneity of FRI showed anisotropic variability characteristics which changed over time. From 1997 to 2009, there was an increasing influence from both autocorrelation factors and random factors. Moreover, these factors played almost equally important roles in forest fire processes in the WSD. The fire risk map was applied to assess the vulnerability of cultural heritage resources in the WSD. Most were located in low- or moderate-risk areas, and therefore would be at low risk from potential fire damage.
