Similar articles
 Found 20 similar articles; search took 15 ms
1.

Background  

Recently, supervised learning methods have been exploited to reconstruct gene regulatory networks from gene expression data. The reconstruction of a network is modeled as a binary classification problem for each pair of genes. A statistical classifier is trained to recognize the relationships between the activation profiles of gene pairs. This approach has been proven to outperform previous unsupervised methods. However, the supervised approach raises open questions. In particular, although known regulatory connections can safely be assumed to be positive training examples, obtaining negative examples is not straightforward, because definite knowledge that a given pair of genes does not interact is typically not available.
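
The pairwise formulation can be sketched as follows. This is a minimal illustration, not the authors' method: the function name, the toy expression profiles, and the choice to concatenate the two profiles as the feature vector are all assumptions. Negatives are drawn at random from unlabeled pairs, which is exactly the shortcut the abstract warns about, since some unlabeled pairs may be undiscovered interactions.

```python
import random
from itertools import permutations

def build_pair_examples(expression, known_edges, seed=0):
    """Return (features, labels) for a gene-pair classifier.

    expression: dict gene -> list of expression values (hypothetical input).
    known_edges: set of (regulator, target) pairs treated as positives.
    Negatives are a random sample of unlabeled ordered pairs (an assumption).
    """
    rng = random.Random(seed)
    genes = sorted(expression)
    unlabeled = [p for p in permutations(genes, 2) if p not in known_edges]
    negatives = rng.sample(unlabeled, min(len(known_edges), len(unlabeled)))
    features, labels = [], []
    for pairs, label in ((sorted(known_edges), 1), (negatives, 0)):
        for a, b in pairs:
            # Feature vector: the two activation profiles side by side.
            features.append(expression[a] + expression[b])
            labels.append(label)
    return features, labels

# Toy data, invented for illustration only.
expression = {
    "g1": [0.1, 0.9, 0.4],
    "g2": [0.2, 0.8, 0.5],
    "g3": [0.9, 0.1, 0.3],
    "g4": [0.5, 0.5, 0.5],
}
known_edges = {("g1", "g2"), ("g3", "g4")}
features, labels = build_pair_examples(expression, known_edges)
```

Any binary classifier could then be trained on `features` and `labels`; how the negatives are chosen is the open question.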

2.
3.
Chen Y, Li Z, Wang X, Feng J, Hu X. BMC Genomics, 2010, 11(Z2): S11

Background

A large amount of functional genomic data now supports computational prediction of gene function, which uses known functional annotations and the relationships between unknown and known genes to map unknown genes to GO functional terms. The prediction procedure is usually formulated as a binary classification problem. Training a binary classifier requires positive and negative example sets of roughly equal size. However, the various annotation databases provide only a few positively annotated genes for most functional terms; that is, there are only a few positive examples for training the classifier, which makes direct prediction of gene function infeasible.

Results

We propose a novel approach, SPE_RNE, to train a classifier for each functional term. First, the positive example set is enlarged by creating synthetic positive examples. Second, representative negative examples are selected by iteratively training an SVM (support vector machine) to move the classification hyperplane to an appropriate place. Finally, an optimal SVM classifier is trained using grid search. On a combined kernel of yeast protein sequence, microarray expression, protein-protein interaction and GO functional annotation data, we compare SPE_RNE with three other typical methods on three classical performance measures: recall R, precision P and their combination F. The three methods are twoclass, which considers all unlabeled genes as negative examples; twoclassbal, which randomly selects the same number of negative examples from the unlabeled genes; and PSoL, which selects a set of negative examples that are far from the positive examples and far from each other.
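
The three performance measures can be computed as in the sketch below. The abstract only calls F the "combination" of R and P; the code assumes it is the usual harmonic mean (F1). The function name and the toy labels are illustrative.

```python
def precision_recall_f(true_labels, predicted_labels):
    """Return (precision, recall, F) for binary labels coded as 0/1.

    F is assumed to be the harmonic mean of precision and recall.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure

# Toy example: 3 true positives in the data, 3 predicted positives, 2 shared.
true_labels = [1, 1, 1, 0, 0, 0, 0, 0]
predicted_labels = [1, 1, 0, 1, 0, 0, 0, 0]
precision, recall, f_measure = precision_recall_f(true_labels, predicted_labels)
```

With few positives and many unlabeled genes, treating every unlabeled gene as negative (as twoclass does) may depress recall, which is one reason to report F rather than accuracy.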

Conclusions

On test data and unknown-gene data, we compute the average and variance of measure F. The experiments show that our approach has better generalization performance and practical prediction capacity. In addition, our method can also be used for other organisms such as human.

4.

Background  

Differentially expressed genes are typically identified by analyzing the variation between replicate measurements. These procedures implicitly assume that there are no systematic errors in the data even though several sources of systematic error are known.

5.

Background  

Many k-mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds.
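
For reference, the distance-threshold heuristic that the abstract criticizes can be sketched as below. This illustrates the conventional approach, not the statistical method the authors go on to propose; the function names, the toy sequence and the `max_gap` value are assumptions.

```python
def kmer_positions(sequence, kmer):
    """Return 0-based start positions of every occurrence, overlaps included."""
    positions = []
    start = sequence.find(kmer)
    while start != -1:
        positions.append(start)
        start = sequence.find(kmer, start + 1)
    return positions

def chain_clusters(positions, max_gap):
    """Group sorted positions whose consecutive starts are at most max_gap apart."""
    clusters = []
    for pos in positions:
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters

# Toy sequence: two CG-rich stretches separated by a run of T.
sequence = "CGCG" + "T" * 10 + "CGACGCG"
positions = kmer_positions(sequence, "CG")
clusters = chain_clusters(positions, max_gap=3)
```

The result depends entirely on `max_gap`: a smaller value splits the second group, and a large enough value merges everything, which is the arbitrariness the abstract points to.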

6.

Background  

The number of genes declared differentially expressed is a random variable and its variability can be assessed by resampling techniques. Another important stability indicator is the frequency with which a given gene is selected across subsamples. We have conducted studies to assess stability and some other properties of several gene selection procedures with biological and simulated data.
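
Both stability indicators can be estimated with a simple subsampling loop, sketched below. The selection rule (absolute difference of group means above a threshold) is a hypothetical stand-in for whichever gene selection procedure is being assessed, and the data, subsample size and threshold are invented for illustration.

```python
import random
from statistics import mean

def select_genes(group_a, group_b, threshold):
    """Hypothetical rule: select genes whose group means differ by more than threshold."""
    return {g for g in group_a
            if abs(mean(group_a[g]) - mean(group_b[g])) > threshold}

def selection_stability(group_a, group_b, threshold,
                        n_subsamples=200, subsample_size=4, seed=0):
    """Return (per-gene selection frequency, list of selected-set sizes)."""
    rng = random.Random(seed)
    counts = {g: 0 for g in group_a}
    set_sizes = []
    n_a = len(next(iter(group_a.values())))
    n_b = len(next(iter(group_b.values())))
    for _ in range(n_subsamples):
        # Draw arrays (columns) without replacement within each group.
        idx_a = rng.sample(range(n_a), subsample_size)
        idx_b = rng.sample(range(n_b), subsample_size)
        sub_a = {g: [v[i] for i in idx_a] for g, v in group_a.items()}
        sub_b = {g: [v[i] for i in idx_b] for g, v in group_b.items()}
        selected = select_genes(sub_a, sub_b, threshold)
        set_sizes.append(len(selected))
        for g in selected:
            counts[g] += 1
    frequency = {g: c / n_subsamples for g, c in counts.items()}
    return frequency, set_sizes

group_a = {
    "strong": [5.0, 5.1, 4.9, 5.2, 5.0, 4.8],
    "null": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "borderline": [0.0, 0.0, 0.0, 3.0, 3.0, 3.0],
}
group_b = {
    "strong": [1.0, 1.1, 0.9, 1.2, 1.0, 0.8],
    "null": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "borderline": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
}
frequency, set_sizes = selection_stability(group_a, group_b, threshold=1.0)
```

The spread of `set_sizes` reflects the variability of the number of genes declared, while `frequency` shows which genes are selected consistently and which only in some subsamples.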

7.
PineappleDB: An online pineapple bioinformatics resource   Cited by: 1 (self-citations: 0, citations by others: 1)

Background  

A world-first pineapple EST sequencing program has been undertaken to investigate genes expressed during non-climacteric fruit ripening and the nematode-plant interaction during root infection. Very little is known of how non-climacteric fruit ripening is controlled or of the molecular basis of the nematode-plant interaction. PineappleDB was developed to provide the research community with access to a curated bioinformatics resource housing the fruit, root and nematode-infected gall expressed sequences.

8.
MicroRNA targets in Drosophila   Cited by: 3 (self-citations: 0, citations by others: 3)

Background  

The recent discoveries of microRNA (miRNA) genes and characterization of the first few target genes regulated by miRNAs in Caenorhabditis elegans and Drosophila melanogaster have set the stage for elucidation of a novel network of regulatory control. We present a computational method for whole-genome prediction of miRNA target genes. The method is validated using known examples. For each miRNA, target genes are selected on the basis of three properties: sequence complementarity using a position-weighted local alignment algorithm, free energies of RNA-RNA duplexes, and conservation of target sites in related genomes. Application to the D. melanogaster, Drosophila pseudoobscura and Anopheles gambiae genomes identifies several hundred target genes potentially regulated by one or more known miRNAs.
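
The first of the three properties, position-weighted complementarity, can be illustrated with a much simplified stand-in: an ungapped sliding scan that counts Watson-Crick pairs and weights positions near the miRNA 5' end more heavily. This is not the paper's local alignment algorithm. The weights, sequences and function names are assumptions, G-U wobble pairs are ignored, and the free-energy and conservation filters are omitted.

```python
# Watson-Crick pairs only; G-U wobble pairing is deliberately left out.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def weighted_score(mirna, site, weights):
    """Score an ungapped duplex between a miRNA and a target segment.

    Both strings are written 5'->3', so the site is reversed to pair antiparallel.
    """
    return sum(w for m, s, w in zip(mirna, reversed(site), weights)
               if (m, s) in PAIRS)

def best_site(mirna, utr, weights):
    """Return (score, start) of the highest-scoring window in the UTR."""
    n = len(mirna)
    scored = [(weighted_score(mirna, utr[i:i + n], weights), i)
              for i in range(len(utr) - n + 1)]
    return max(scored, key=lambda item: item[0])

# Toy sequences: the UTR embeds the exact reverse complement of the miRNA.
mirna = "UGAGGUAG"
utr = "AAAA" + "CUACCUCA" + "GGGG"
weights = [2.0] * 4 + [1.0] * 4  # hypothetical: 5' positions count double
score, start = best_site(mirna, utr, weights)
```

A real predictor would allow gaps and mismatches and would then filter candidate sites by duplex free energy and cross-species conservation, as the abstract describes.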

9.

Background  

Olea europaea L. is a traditional tree crop of the Mediterranean basin with a high economic impact worldwide. Unlike other fruit tree species, little is known about the physiological and molecular basis of olive fruit development, and few sequences of genes and gene products are available for olive in public databases. This study deals with the identification of large sets of differentially expressed genes in developing olive fruits and their subsequent computational annotation by means of different software.

10.

Background  

The goal of most microarray studies is either the identification of genes that are most differentially expressed or the creation of a good classification rule. The disadvantage of the former is that it ignores the importance of gene interactions; the disadvantage of the latter is that it often does not provide a sufficient focus for further investigation because many genes may be included by chance. Our strategy is to search for classification rules that perform well with few genes and, if they are found, identify genes that occur relatively frequently under multiple random validation (random splits into training and test samples).

11.

Background  

The chicken avidin gene family consists of avidin and several avidin related genes (AVRs). Of these gene products, avidin is the best characterized and is known for its extremely high affinity for D-biotin, a property that is utilized in numerous modern life science applications. Recently, the AVR genes have been expressed as recombinant proteins, which have shown different biotin-binding properties as compared to avidin.

12.

Background  

We present a novel method of protein fold decoy discrimination using machine learning, more specifically using neural networks. Here, decoy discrimination is represented as a machine learning problem, where neural networks are used to learn the native-like features of protein structures using a set of positive and negative training examples. A set of native protein structures provides the positive training examples, while negative training examples are simulated decoy structures obtained by reversing the sequences of native structures. Various features are extracted from the training dataset of positive and negative examples and used as inputs to the neural networks.
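
The labelling scheme can be sketched at the sequence level as below. The paper extracts its features from structures, so this covers only the reversal step; the function names, toy sequences and the dipeptide feature are assumptions made for illustration.

```python
from collections import Counter

def make_training_set(native_sequences):
    """Label native sequences 1 and their reversed copies 0.

    Palindromic sequences are kept only as positives, since reversing them
    would yield an identical string with a conflicting label.
    """
    examples = []
    for seq in native_sequences:
        examples.append((seq, 1))
        reversed_seq = seq[::-1]
        if reversed_seq != seq:
            examples.append((reversed_seq, 0))
    return examples

def dipeptide_counts(seq):
    """Hypothetical order-sensitive feature: counts of adjacent residue pairs."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))

examples = make_training_set(["MKVL", "AGA"])
native, decoy = examples[0][0], examples[1][0]
same_composition = Counter(native) == Counter(decoy)
different_dipeptides = dipeptide_counts(native) != dipeptide_counts(decoy)
```

Reversal preserves amino acid composition, so a composition-only feature cannot separate a native sequence from its reversed decoy; the inputs to the network have to be sensitive to residue order or to structural context.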

13.

Background  

Riboswitches are RNA elements in the 5' untranslated leaders of bacterial mRNAs that directly sense the levels of specific metabolites with a structurally conserved aptamer domain to regulate expression of downstream genes. Riboswitches are most common in the genomes of low GC Gram-positive bacteria (for example, Bacillus subtilis contains examples of all known riboswitches), and some riboswitch classes seem to be restricted to this group.

14.

Background  

A microarray study may select different differentially expressed gene sets because of different selection criteria. For example, the fold-change and p-value are two commonly known criteria to select differentially expressed genes under two experimental conditions. These two selection criteria often result in incompatible selected gene sets. Also, in a two-factor, say, treatment by time experiment, the investigator may be interested in one gene list that responds to both treatment and time effects.
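
A small sketch shows how the two criteria can disagree. The gene names, values and cutoffs are invented for illustration, and taking the intersection is only one simple way to reconcile the lists, not necessarily the approach of the paper.

```python
def select_by_criteria(log2_fc, p_values, fc_cutoff=1.0, p_cutoff=0.05):
    """Return (selected by fold-change, selected by p-value, selected by both).

    log2_fc: dict gene -> log2 fold-change between the two conditions.
    p_values: dict gene -> p-value from some two-condition test.
    The cutoffs are hypothetical defaults.
    """
    by_fc = {g for g, fc in log2_fc.items() if abs(fc) >= fc_cutoff}
    by_p = {g for g, p in p_values.items() if p <= p_cutoff}
    return by_fc, by_p, by_fc & by_p

# Toy values: geneB changes a lot but noisily; geneC changes little but consistently.
log2_fc = {"geneA": 2.1, "geneB": -1.6, "geneC": 0.3, "geneD": 0.1}
p_values = {"geneA": 0.001, "geneB": 0.20, "geneC": 0.01, "geneD": 0.70}
by_fc, by_p, by_both = select_by_criteria(log2_fc, p_values)
```

Here the fold-change list and the p-value list each contain a gene the other misses, so the choice of criterion changes the reported gene set.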

15.
16.

Background  

CatSper1-4 are a unique family of sperm cation channels, which are exclusively expressed in the testis and play an important role in sperm motility and male fertility. Despite their vital role in male fertility, almost nothing is known about the factors regulating their expression. Here, we investigated the effects of selenium (Se) on the expression of CatSper genes and sperm parameters in aging versus young male mice.

17.

Background  

In contrast to the majority of mammalian genes, imprinted genes are monoallelically expressed with the choice of the active allele depending on its parental origin. Due to their special inheritance patterns, maternally and paternally expressed genes might be under different evolutionary pressure. Here, we aimed at assessing the evolutionary history of imprinted genes.

18.

Background  

The inflorescence of the cut-flower crop Gerbera hybrida (Asteraceae) consists of two principal flower types, ray and disc, which form a tightly packed head, or capitulum. Despite great interest in plant morphological evolution and the tractability of the gerbera system, very little is known regarding genetic mechanisms involved in flower type specification. Here, we provide comparative staging of ray and disc flower development and microarray screening for differentially expressed genes, accomplished via microdissection of hundreds of coordinately developing flower primordia.

19.

Background  

We are interested in understanding the locational distribution of genes and their functions in genomes, as this distribution has both functional and evolutionary significance. Gene locational distribution is known to be affected by various evolutionary processes, with tandem duplication thought to be the main process producing clustering of homologous sequences. Recent research has found clustering of protein structural families in the human genome, even when genes identified as tandem duplicates have been removed from the data. However, this previous research was hindered by its inability to analyse small sample sizes. This is a challenge for bioinformatics, as more specific functional classes have fewer examples and conventional statistical analyses of these small data sets often produce unsatisfactory results.

20.

Background  

Although homeobox genes have been the subject of many studies, little is known about the main amino acid changes that occurred early in the evolution of genes belonging to different classes.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号