Similar literature (20 articles found)
1.
2.
The number of Tibetan web pages is growing rapidly, and information processing technology offers a way to extract useful knowledge from them. A Tibetan semantic ontology can enrich Tibetan digital resources and help improve information processing performance. This paper studies the semantic classification of a Tibetan web corpus. First, Tibetan web pages are collected. Second, preprocessing extracts the useful information from the pages. Third, word segmentation and text representation are introduced. Finally, a text similarity classification algorithm is proposed to classify the texts. The experiments compare semantic with non-semantic classification. The results show that semantic classification clearly outperforms non-semantic classification, which suggests that making full use of ontology semantic relationships can greatly improve classification accuracy. The research is useful for the study of Tibetan semantic information processing.

3.
We have conducted a study on the long-term availability of bioinformatics Web services: an observation of 927 Web services published in the annual Nucleic Acids Research Web Server Issues between 2003 and 2009. We found that 72% of Web sites are still available at the published addresses; only 9% of services are completely unavailable. Older addresses often redirect to new pages. We checked the functionality of all available services: for 33%, we could not test functionality because there was no example data or a related problem; 13% were truly no longer working as expected; we could positively confirm functionality only for 45% of all services. Additionally, we conducted a survey among 872 Web Server Issue corresponding authors; 274 replied. 78% of all respondents indicate their services have been developed solely by students and researchers without a permanent position. Consequently, these services are in danger of falling into disrepair after the original developers move to another institution, and indeed, for 24% of services, there is no plan for maintenance, according to the respondents. We introduce a Web service quality scoring system that correlates with the number of citations: services with a high score are cited 1.8 times more often than low-scoring services. We have identified key characteristics that are predictive of a service's survival, providing reviewers, editors, and Web service developers with the means to assess or improve Web services. A Web service conforming to these criteria receives more citations and provides more reliable service for its users. The most effective way of ensuring continued access to a service is a persistent Web address, offered either by the publishing journal, or created on the authors' own initiative, for example at http://bioweb.me. The community would benefit the most from a policy requiring any source code needed to reproduce results to be deposited in a public repository.

4.
This paper proposes a system named AWSCS (Automatic Web Service Composition System) to evaluate different approaches for automatic composition of Web services, based on QoS parameters that are measured at execution time. AWSCS implements different approaches for automatic composition of Web services and also executes the flows these approaches produce. To demonstrate the results, a scenario was developed in which empirical flows were built to show the operation of AWSCS, since algorithms for automatic composition are not readily available for testing. The results allow us to study the behaviour of running composite Web services when flows with the same functionality but different problem-solving strategies are compared. Furthermore, we observed that the load applied to the running system, and in particular the type of load submitted, is an important factor in determining which Web service composition approach achieves the best performance in production.

5.

Over the last decades, web services have been used to perform specific tasks requested by users. The most important task of a service classification system is to match an anonymous input service with stored, pre-classified web services. The main challenge is that web services are currently organized and classified according to syntax, while the context of the requested service is ignored. To address this, a Cloud-based Classification Methodology is proposed, based on semantic web service classification. Cloud computing is used not only to store but also to allocate web services at large scale with high availability and accessibility. Fog technology is employed to reduce latency and speed up response time. Experimental results show that the proposed methodology achieves better precision and accuracy than most of the methods discussed in the literature review of the study.


6.
In this paper, we propose bounding models, which provide upper and lower bounds on response time in a composite Web service model, to alleviate the state explosion problem. The models considered have heterogeneous servers, and the number of elementary Web services can be very large. More precisely, we study two types of composite Web services. First, we investigate the performance of a single composite Web service execution instance. Second, this assumption is relaxed (i.e. multiple composite Web service execution instances are considered). These models allow a trade-off to be found between the accuracy of the bounds and the computational complexity.

7.
Bioinformatics activities are growing all over the world, with proliferation of data and tools. This brings new challenges: how to understand and organize these resources and how to provide interoperability among tools to achieve a given goal. We defined and implemented a framework to help meet some of these challenges. Four issues were considered: the use of Web services as a basic unit, the notion of a Semantic Web to improve interoperability at the syntactic and semantic levels, and the use of scientific workflows to coordinate services to be executed, including their interdependencies and service orchestration.

8.
MOTIVATION: In microarray studies, numerous tools are available for functional enrichment analysis based on GO categories. Most of these tools require a prior threshold for designating genes as differentially expressed genes (DEGs) and are therefore categorized as threshold-dependent methods, which are often criticized because their results change with different thresholds. RESULTS: In the present article, by considering the inherent correlation structure of the GO categories, a continuous measure based on semantic similarity of GO categories is proposed to investigate the functional consistency (or stability) of threshold-dependent methods. The results from several datasets show that when simply counting overlapping categories between two groups, the significant category groups selected under different DEG thresholds are seemingly very different. However, based on the semantic similarity measure proposed in this article, the results are rather functionally consistent for a wide range of DEG thresholds. Moreover, we find that the functional consistency of gene lists ranked by the SAM metric is relatively robust against changing DEG thresholds. AVAILABILITY: Source code in R is available on request from the authors.

9.
Several semantic Web Services clients for Bioinformatics have been released, but to date no support systems for service providers have been described. We have created a framework ('MobyServlet') that very simply allows an existing Java application to conform to the MOBY-S semantic Web Services protocol. Using an existing Java program for codon-pair bias determination as an example, we enumerate the steps required for MOBY-S compliance. With minimal programming effort, such a deployment has the advantages of: (1) wider exposure to the user community by automatic inclusion in all MOBY-S client programs and (2) automatic interoperability with other MOBY-S services for input and output. Complex on-line analysis will become easier for biologists as more developers adopt MOBY-S. AVAILABILITY: The framework and documentation are freely available from the Java developer's section of http://www.biomoby.org/.

10.
MOTIVATION: Computationally, in silico experiments in biology are workflows describing the collaboration of people, data and methods. The Grid and Web services are proposed to be the next generation infrastructure supporting the deployment of bioinformatics workflows. But the growing number of autonomous and heterogeneous services poses challenges to the middleware used with respect to composition, i.e. discovery and interoperability of services required within in silico experiments. In the IRIS project, we handle the problem of service interoperability by a semi-automatic procedure for identifying and placing customizable adapters into workflows built by service composition. RESULTS: We show the effectiveness and robustness of the software-aided composition procedure by a case study in the field of life science. In this study we combine different database services with different analysis services with the objective of discovering required adapters. Our experiments show that we can identify relevant adapters with high precision and recall.

11.
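Ecosystem service assessment is an important topic in geography and ecology. Current assessment studies focus mainly on the risk of degradation in service supply capacity, and lack risk assessment frameworks and case studies that jointly consider the matching of supply and demand, the dynamic trends of supply and demand, and the trade-offs and synergies among services. This study first systematically reviews research progress on ecosystem service supply-demand risk assessment, proposes a regional-scale research framework for ecosystem service supply-demand risk, and applies it in a case study of the water yield service in Shaanxi Province to reveal the spatio-temporal changes in its supply-demand risk. It then discusses the influencing factors, research significance and future research directions of ecosystem service supply-demand risk assessment. The results show that: (1) regarding the spatio-temporal pattern of supply and demand, total supply of and demand for the water yield service in Shaanxi Province both increased from 2000 to 2010. Southern Shaanxi is the main supply area, while areas of high demand are concentrated in the Guanzhong region and the Hanzhong Basin. (2) Regarding supply-demand matching, the area in which supply could not meet demand was 3.93% smaller in 2010 than in 2000, and spatial matching of supply and demand improved overall. (3) Regarding supply-demand risk, high-risk areas accounted for 13.37% of the province in 2000-2010, mainly in the Guanzhong region and Yulin City; low-risk areas accounted for 86.63%, mainly in southern Shaanxi, Yan'an City and Baoji City. Compared with 2000-2005, the risk level in 2005-2010 was clearly lower, with the share of high-risk areas falling by 1.63%. The results are intended to provide theoretical support for research on, and management applications of, ecosystem service risk assessment.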

12.
Gene/protein recognition and normalization is an important preliminary step for many biological text mining tasks. In this paper, we present a multistage gene normalization system which consists of four major subtasks: pre-processing, dictionary matching, ambiguity resolution and filtering. For the first subtask, we apply the gene mention tagger developed in our earlier work, which achieves an F-score of 88.42% on the BioCreative II GM testing set. In the dictionary matching stage, exact matching and approximate matching between gene names and the EntrezGene lexicon are combined. For the ambiguity resolution subtask, we propose a semantic similarity disambiguation method based on Munkres' Assignment Algorithm. At the last step, a filter based on Wikipedia has been built to remove the false positives. Experimental results show that the presented system can achieve an F-score of 90.1%, outperforming most of the state-of-the-art systems.

13.
Cloud computing technology plays a very important role in many areas, such as the construction and development of the smart city. Meanwhile, numerous cloud services appear on cloud-based platforms. How to select trustworthy cloud services therefore remains a significant problem on such platforms, and it has been extensively investigated owing to the ever-growing needs of users. However, trust relationships in social networks have not been taken into account in existing methods of cloud service selection and recommendation. In this paper, we propose a cloud service selection model based on trust-enhanced similarity. Firstly, the direct, indirect, and hybrid trust degrees are measured based on the interaction frequencies among users. Secondly, we estimate the overall similarity by combining the experience usability measured based on Jaccard’s Coefficient and the numerical distance computed by the Pearson Correlation Coefficient. Then, by using the trust degree to modify the basic similarity, we obtain a trust-enhanced similarity. Finally, we utilize the trust-enhanced similarity to find similar trusted neighbors and predict the missing QoS values as the basis of cloud service selection and recommendation. The experimental results show that our approach is able to obtain optimal results via adjusting parameters and exhibits high effectiveness. The cloud services ranked by our model also have better QoS properties than those ranked by other methods in the comparison experiments.
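The similarity step in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' model: the abstract gives no formulas, so the function names, the blend weight `alpha`, and the choice to multiply the base similarity by the trust degree are all hypothetical.

```python
from math import sqrt


def jaccard(services_a, services_b):
    """Jaccard coefficient of the sets of services two users have invoked."""
    union = services_a | services_b
    return len(services_a & services_b) / len(union) if union else 0.0


def pearson(qos_a, qos_b):
    """Pearson correlation over the services that both users have QoS values for."""
    common = sorted(set(qos_a) & set(qos_b))
    if len(common) < 2:
        return 0.0  # not enough shared observations to correlate
    xs = [qos_a[s] for s in common]
    ys = [qos_b[s] for s in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has no defined correlation
    return cov / sqrt(var_x * var_y)


def trust_enhanced_similarity(qos_a, qos_b, trust, alpha=0.5):
    """Blend Jaccard and Pearson, then scale by a trust degree in [0, 1].

    ASSUMPTION: both the blend weight `alpha` and the multiplicative use of
    `trust` are choices made for this sketch, not taken from the paper.
    """
    base = alpha * jaccard(set(qos_a), set(qos_b)) + (1 - alpha) * pearson(qos_a, qos_b)
    return trust * base


# Hypothetical QoS observations (service id -> normalised QoS value).
alice = {"s1": 0.9, "s2": 0.4, "s3": 0.7}
bob = {"s1": 0.8, "s2": 0.5, "s4": 0.6}
print(trust_enhanced_similarity(alice, bob, trust=0.5))
```

Users with the highest resulting scores would then serve as the "similar trusted neighbors" whose observed QoS values are used to predict missing ones.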

14.
Optimal spliced alignment of homologous cDNA to a genomic DNA template   (Total citations: 17; self-citations: 0; citations by others: 17)
MOTIVATION: Supplementary cDNA or EST evidence is often decisive for discriminating between alternative gene predictions derived from computational sequence inspection by any of a number of requisite programs. Without additional experimental effort, this approach must rely on the occurrence of cognate ESTs for the gene under consideration in available, generally incomplete, EST collections for the given species. In some cases, particular exon assignments can be supported by sequence matching even if the cDNA or EST is produced from non-cognate genomic DNA, including different loci of a gene family or homologous loci from different species. However, marginally significant sequence matching alone can also be misleading. We sought to develop an algorithm that would simultaneously score for predicted intrinsic splice site strength and sequence matching between the genomic DNA template and a related cDNA or EST. In this case, weakly predicted splice sites may be chosen for the optimal scoring spliced alignment on the basis of surrounding sequence matching. Strongly predicted splice sites will enter the optimal spliced alignment even without strong sequence matching. RESULTS: We designed a novel algorithm that produces the optimal spliced alignment of a genomic DNA with a cDNA or EST based on scoring for both sequence matching and intrinsic splice site strength. By example, we demonstrate that this combined approach appears to improve gene prediction accuracy compared with current methods that rely only on either search by content and signal or on sequence similarity. AVAILABILITY: The algorithm is available as a C subroutine and is implemented in the SplicePredictor and GeneSeqer programs. The source code is available via anonymous ftp from ftp.zmdb.iastate.edu. Both programs are also implemented as a Web service at http://gremlin1.zool.iastate.edu/cgi-bin/sp.cgi and http://gremlin1.zool.iastate.edu/cgi-bin/gs.cgi, respectively. CONTACT: vbrendel@iastate.edu

15.
MOTIVATION: The inference of genes that are truly associated with inherited human diseases from a set of candidates resulting from genetic linkage studies has been one of the most challenging tasks in human genetics. Although several computational approaches have been proposed to prioritize candidate genes relying on protein-protein interaction (PPI) networks, these methods can usually cover less than half of known human genes. RESULTS: We propose to rely on the biological process domain of the gene ontology to construct a gene semantic similarity network and then use the network to infer disease genes. We show that the constructed network covers about 50% more genes than a typical PPI network. By analyzing the gene semantic similarity network with the PPI network, we show that gene pairs tend to have higher semantic similarity scores if the corresponding proteins are closer to each other in the PPI network. By analyzing the gene semantic similarity network with a phenotype similarity network, we show that semantic similarity scores of genes associated with similar diseases are significantly different from those of genes selected at random, and that genes with higher semantic similarity scores tend to be associated with diseases with higher phenotype similarity scores. We further use the gene semantic similarity network with a random walk with restart model to infer disease genes. Through a series of large-scale leave-one-out cross-validation experiments, we show that the gene semantic similarity network can achieve not only higher coverage but also higher accuracy than the PPI network in the inference of disease genes.
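The random walk with restart mentioned in this abstract can be illustrated with a short sketch. It assumes a weighted, symmetric similarity network stored as nested dictionaries and a restart probability of 0.7; the function name, the data layout, the toy gene names and the parameter values are hypothetical, since the abstract does not specify them.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for every node.

    adjacency: dict mapping node -> dict of neighbour -> non-negative weight.
               Every neighbour must itself be a key of `adjacency`.
    seeds:     non-empty set of known disease genes; the walker restarts
               uniformly among them.
    ASSUMPTION: restart=0.7 is an illustrative value, not taken from the paper.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    # Total outgoing weight per node, used to turn weights into transition probabilities.
    out_weight = {n: sum(adjacency[n].values()) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed gene.
        nxt = {n: restart * p0[n] for n in nodes}
        # Otherwise it moves to a neighbour in proportion to edge weight.
        for src in nodes:
            if out_weight[src] == 0:
                continue  # isolated node: its probability mass is dropped
            for dst, w in adjacency[src].items():
                nxt[dst] += (1 - restart) * p[src] * w / out_weight[src]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy network: a chain of four hypothetical genes, with g1 as the known disease gene.
network = {
    "g1": {"g2": 1.0},
    "g2": {"g1": 1.0, "g3": 1.0},
    "g3": {"g2": 1.0, "g4": 1.0},
    "g4": {"g3": 1.0},
}
scores = random_walk_with_restart(network, {"g1"})
candidates = sorted((g for g in scores if g != "g1"), key=scores.get, reverse=True)
print(candidates)
```

Candidate genes are ranked by their steady-state score, so genes that are well connected to the seed genes in the similarity network rise to the top.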

16.
17.
The current increase in Gene Ontology (GO) annotations of proteins in the existing genome databases and their use in different analyses have fostered the improvement of several biomedical and biological applications. To integrate this functional data into different analyses, several protein functional similarity measures based on GO term information content (IC) have been proposed and evaluated, especially in the context of annotation-based measures. In the case of topology-based measures, each approach was set with a specific functional similarity measure depending on its conception and applications for which it was designed. However, it is not clear whether a specific functional similarity measure associated with a given approach is the most appropriate, given a biological data set or an application, i.e., achieving the best performance compared to other functional similarity measures for the biological application under consideration. We show that, in general, a specific functional similarity measure often used with a given term IC or term semantic similarity approach is not always the best for different biological data and applications. We have conducted a performance evaluation of a number of different functional similarity measures using different types of biological data in order to infer the best functional similarity measure for each different term IC and semantic similarity approach. The comparisons of different protein functional similarity measures should help researchers choose the most appropriate measure for the biological application under consideration.

18.
Content-Aware Dispatching Algorithms for Cluster-Based Web Servers   (Total citations: 1; self-citations: 0; citations by others: 1)
Cluster-based Web servers are leading architectures for highly accessed Web sites. The most common Web cluster architecture consists of replicated server nodes and a Web switch that routes client requests among the nodes. In this paper, we consider content-aware Web switches that can use application level information to assign client requests. We evaluate the performance of some representative state-of-the-art dispatching algorithms for Web switches operating at layer 7 of the OSI protocol stack. Specifically, we consider dispatching algorithms that use only client information as well as the combination of client and server information for load sharing, reference locality or service partitioning. We demonstrate through a wide set of simulation experiments that dispatching policies aiming to improve locality in server caches give the best results for traditional Web publishing sites providing static information and some simple database searches. On the other hand, when we consider more recent Web sites providing dynamic and secure services, dispatching policies that aim to share the load are the most effective.
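The two policy families contrasted in this abstract can be illustrated with a minimal sketch. The abstract names goals (cache locality versus load sharing) rather than specific algorithms, so URL hashing and least-connections are used here as assumed stand-ins, and the function names and node names are hypothetical.

```python
import zlib


def dispatch_by_locality(url, servers):
    """Locality-oriented policy: always send the same URL to the same node.

    Repeated requests for a resource then hit a warm cache on that node.
    ASSUMPTION: plain URL hashing stands in for the paper's locality policies.
    """
    return servers[zlib.crc32(url.encode("utf-8")) % len(servers)]


def dispatch_by_load(active_connections):
    """Load-sharing policy: pick the node with the fewest active connections.

    ASSUMPTION: least-connections stands in for the paper's load-sharing policies.
    """
    return min(active_connections, key=active_connections.get)


servers = ["node-a", "node-b", "node-c"]

# Static content: the same URL is always routed to the same node.
first = dispatch_by_locality("/images/logo.png", servers)
second = dispatch_by_locality("/images/logo.png", servers)
print(first == second)

# Dynamic content: route to whichever node is currently least busy.
print(dispatch_by_load({"node-a": 12, "node-b": 3, "node-c": 7}))
```

The trade-off in the abstract follows from the design: hashing ignores current load, which is harmless when requests are cheap and cacheable, while least-connections ignores cache contents, which matters less when each request is expensive to compute.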

19.
The burden of non-interoperability between on-line genomic resources is increasingly the rate-limiting step in large-scale genomic analysis. BioMOBY is a biological Web Service interoperability initiative that began as a retreat of representatives from the model organism database community in September, 2001. Its long-term goal is to provide a simple, extensible platform through which the myriad of on-line biological databases and analytical tools can offer their information and analytical services in a fully automated and interoperable way. Of the two branches of the larger BioMOBY project, the Web Services branch (MOBY-S) has now been deployed over several dozen data sources worldwide, revealing some significant observations about the nature of the integrative biology problem; in particular, that Web Service interoperability in the domain of bioinformatics is, unexpectedly, largely a syntactic rather than a semantic problem. That is to say, interoperability between bioinformatics Web Services can be largely achieved simply by specifying the data structures being passed between the services (syntax) even without rich specification of what those data structures mean (semantics). Thus, one barrier of the integrative problem has been overcome with a surprisingly simple solution. Here, we present a non-technical overview of the critical components that give rise to the interoperable behaviors seen in MOBY-S and discuss an exemplar case, the PlaNet consortium, where MOBY-S has been deployed to integrate the on-line plant genome databases and analytical services provided by a European consortium of databases and data service providers.

20.
The rapid growth of published cloud services on the Internet makes service selection and recommendation a challenging task for both users and service providers. In cloud environments, software services collaborate with other complementary services to provide complete solutions to end users. Service selection is performed based on QoS requirements submitted by end users. Software providers alone cannot guarantee users’ QoS requirements. These requirements must be end-to-end, representing all collaborating services in a cloud solution. In this paper, we propose a prediction model to compute end-to-end QoS values for vertically composed services, which are composed of three types of cloud services: software (SaaS), infrastructure (IaaS) and data (DaaS) services. These values can be used during the service selection and recommendation process. Our model exploits historical QoS values together with cloud service and user information to predict unknown end-to-end QoS values of composite services. The experiments demonstrate that our proposed model outperforms other prediction models in terms of prediction accuracy. We also study the impact of different parameters on the prediction results. In the experiments, we used real cloud services’ QoS data collected using our developed QoS monitoring and collecting system.
