共查询到20条相似文献,搜索用时 31 毫秒
1.
2.
3.
Background
The ability to distinguish between genes and proteins is essential for understanding biological text. Support Vector Machines (SVMs) have been proven to be very efficient in general data mining tasks. We explore their capability for the gene versus protein name disambiguation task. 相似文献4.
Background
Frequently, several alternative names are in use for biological objects such as genes and proteins. Applications like manual literature search, automated text-mining, named entity identification, gene/protein annotation, and linking of knowledge from different information sources require the knowledge of all used names referring to a given gene or protein. Various organism-specific or general public databases aim at organizing knowledge about genes and proteins. These databases can be used for deriving gene and protein name dictionaries. So far, little is known about the differences between databases in terms of size, ambiguities and overlap. 相似文献5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
Background
Several in silico methods exist that were developed to predict protein interactions from the copious amount of genomic and proteomic data. One of these methods is Domain Fusion, which has proven to be effective in predicting functional links between proteins. 相似文献17.
Eiko E Kuramae Vincent Robert Carlos Echavarri-Erasun Teun Boekhout 《BMC evolutionary biology》2007,7(1):134
Background
The construction of robust and well resolved phylogenetic trees is important for our understanding of many, if not all biological processes, including speciation and origin of higher taxa, genome evolution, metabolic diversification, multicellularity, origin of life styles, pathogenicity and so on. Many older phylogenies were not well supported due to insufficient phylogenetic signal present in the single or few genes used in phylogenetic reconstructions. Importantly, single gene phylogenies were not always found to be congruent. The phylogenetic signal may, therefore, be increased by enlarging the number of genes included in phylogenetic studies. Unfortunately, concatenation of many genes does not take into consideration the evolutionary history of each individual gene. Here, we describe an approach to select informative phylogenetic proteins to be used in the Tree of Life (TOL) and barcoding projects by comparing the cophenetic correlation coefficients (CCC) among individual protein distance matrices of proteins, using the fungi as an example. The method demonstrated that the quality and number of concatenated proteins is important for a reliable estimation of TOL. Approximately 40–45 concatenated proteins seem needed to resolve fungal TOL. 相似文献18.
Background
The functions of human cells are carried out by biomolecular networks, which include proteins, genes, and regulatory sites within DNA that encode and control protein expression. Models of biomolecular network structure and dynamics can be inferred from high-throughput measurements of gene and protein expression. We build on our previously developed fuzzy logic method for bridging quantitative and qualitative biological data to address the challenges of noisy, low resolution high-throughput measurements, i.e., from gene expression microarrays. We employ an evolutionary search algorithm to accelerate the search for hypothetical fuzzy biomolecular network models consistent with a biological data set. We also develop a method to estimate the probability of a potential network model fitting a set of data by chance. The resulting metric provides an estimate of both model quality and dataset quality, identifying data that are too noisy to identify meaningful correlations between the measured variables. 相似文献19.
Amelie Fassbender Peter Simsa Cleophas M Kyama Etienne Waelkens Attila Mihalyi Christel Meuleman Olivier Gevaert Raf Van de Plas Bart de Moor Thomas M D'Hooghe 《Reproductive biology and endocrinology : RB&E》2010,8(1):123