Similar articles
 20 similar articles found (search time: 15 ms)
1.
Gene selection: a Bayesian variable selection approach   Total citations: 13, self-citations: 0, citations by others: 13
Selection of significant genes via expression patterns is an important problem in microarray experiments. Owing to small sample size and the large number of variables (genes), the selection process can be unstable. This paper proposes a hierarchical Bayesian model for gene (variable) selection. We employ latent variables to specialize the model to a regression setting and use a Bayesian mixture prior to perform the variable selection. We control the size of the model by assigning a prior distribution over the dimension (number of significant genes) of the model. The posterior distributions of the parameters are not in explicit form and we need to use a combination of truncated sampling and Markov Chain Monte Carlo (MCMC) based computation techniques to simulate the parameters from the posteriors. The Bayesian model is flexible enough to identify significant genes as well as to perform future predictions. The method is applied to cancer classification via cDNA microarrays where the genes BRCA1 and BRCA2 are associated with a hereditary disposition to breast cancer, and the method is used to identify a set of significant genes. The method is also applied successfully to the leukemia data. SUPPLEMENTARY INFORMATION: http://stat.tamu.edu/people/faculty/bmallick.html.

2.
Based on nearly complete genome sequences from a variety of organisms, data on naturally occurring genetic variation on the scale of hundreds of loci to entire genomes have been collected in recent years. In parallel, new statistical tests have been developed to infer evidence of recent positive selection from these data and to localize the target regions of selection in the genome. These methods have now been successfully applied to Drosophila melanogaster, humans, mice and a few plant species. In genomic regions of normal recombination rates, the targets of positive selection have been mapped down to the level of individual genes.

3.
MOTIVATION: Anchoring of proteins to the extracytosolic leaflet of membranes via C-terminal attachment of glycosylphosphatidylinositol (GPI) is ubiquitous and essential in eukaryotes. The signal for GPI-anchoring is confined to the C-terminus of the target protein. In order to identify anchoring signals in silico, we have trained neural networks on known GPI-anchored proteins, systematically optimizing input parameters. RESULTS: A Kohonen self-organizing map, GPI-SOM, was developed that predicts GPI-anchored proteins with high accuracy. In combination with SignalP, GPI-SOM was used in genome-wide surveys for GPI-anchored proteins in diverse eukaryotes. Apart from specialized parasites, a general trend towards higher percentages of GPI-anchored proteins in larger proteomes was observed. AVAILABILITY: GPI-SOM is accessible on-line at http://gpi.unibe.ch. The source code (written in C) is available on the same website. SUPPLEMENTARY INFORMATION: Positive training set, performance test sets and lists of predicted GPI-anchored proteins from different eukaryotes in fasta format.

4.
Computer recognition of short functional sites on DNA, such as promoter regions or intron-exon boundaries, has recently attracted much interest. In this paper we have focused our attention on the automatic recognition of relevant features of human nucleic acid sequences by means of an unsupervised artificial neural network model. Sixty messenger RNA and 31 genomic DNA sequences were analysed. The results showed that in mRNA, the minimal similarity 60 base pattern was guanine- and cytosine-rich and located in most sequences in a range of 250 bases from either the middle point of the signal peptide coding region or from the start of the coding region. On DNA sequences a region defined by a cluster of minimal similarity patterns was present in many of the analysed genes. This zone may be related to alternative splicing and DNA methylation.

5.
A new self-organizing map (SOM) architecture called the ASSOM (adaptive-subspace SOM) is shown to create sets of translation-invariant filters when randomly displaced or moving input patterns are used as training data. No analytical functional forms for these filters are thereby postulated. Different kinds of filters are formed by the ASSOM when pictures are rotated during learning, or when they are zoomed. The ASSOM can thus act as a learning feature-extraction stage for pattern recognizers, being able to adapt to many sensory environments and to many different transformation groups of patterns. Received: 14 September 1995 / Accepted in revised form: 8 May 1996

6.
SUMMARY: INteractive Codon usage Analysis (INCA) provides an array of features useful in analysis of synonymous codon usage in whole genomes. In addition to computing codon frequencies and several usage indices, such as 'codon bias', effective Nc and CAI, the primary strength of INCA is its numerous options for the interactive graphical display of calculated values, thus allowing visual detection of various trends in codon usage. Finally, INCA includes a specific unsupervised neural network algorithm, the self-organizing map, used for gene clustering according to the preferred utilization of codons. AVAILABILITY: INCA is available for the Win32 platform and is free of charge for academic use. For details, visit the web page http://www.bioinfo-hr.org/inca or contact the author directly. SUPPLEMENTARY INFORMATION: Software is accompanied by a user manual and a short tutorial.

7.
Monte Carlo feature selection for supervised classification   Total citations: 4, self-citations: 0, citations by others: 4
MOTIVATION: Pre-selection of informative features for supervised classification is a crucial, albeit delicate, task. It is desirable that feature selection provides the features that contribute most to the classification task per se and which should therefore be used by any classifier later used to produce classification rules. In this article, a conceptually simple but computer-intensive approach to this task is proposed. The reliability of the approach rests on multiple construction of a tree classifier for many training sets randomly chosen from the original sample set, where samples in each training set consist of only a fraction of all of the observed features. RESULTS: The resulting ranking of features may then be used to advantage for classification via a classifier of any type. The approach was validated using Golub et al. leukemia data and the Alizadeh et al. lymphoma data. Not surprisingly, we obtained a significantly different list of genes. Biological interpretation of the genes selected by our method showed that several of them are involved in precursors to different types of leukemia and lymphoma rather than being genes that are common to several forms of cancers, which is the case for the other methods. AVAILABILITY: Prototype available upon request.

8.
9.
Xiao L, Wang K, Teng Y, Zhang J. FEBS Letters 2003, 540(1-3): 117-124
Wheat gliadin and other cereal prolamins have been said to be involved in the pathogenic damage of the small intestine in celiac disease via the apoptosis of epithelial cells. In the present work we investigated the mechanisms underlying the pro-apoptotic activity exerted by gliadin-derived peptides in Caco-2 intestinal cells, a cell line which retains many morphological and enzymatic features typical of normal human enterocytes. We found that digested peptides from wheat gliadins (i) induce apoptosis by the CD95/Fas apoptotic pathway, (ii) induce increased Fas and FasL mRNA levels, (iii) determine increased FasL release in the medium, and (iv) that gliadin digest-induced apoptosis can be blocked by Fas cascade blocking agents, i.e. targeted neutralizing antibodies. This favors the hypothesis that gliadin could activate an autocrine/paracrine Fas-mediated cell death pathway. Finally, we found that (v) a small peptide (1157 Da) from durum wheat, previously proposed for clinical practice, exerted a powerful protective activity against gliadin digest cytotoxicity.

10.
The two-dimensional movement tracks of the STAT92E06346 mutant and two control strains (Oregon red (OR) and TM3) of Drosophila melanogaster were continuously observed with image processors. A Self-Organizing Map (SOM) was then used to pattern the response behaviors of the tested specimens. This revealed the movement behaviors of the different strains and sexes. The SOM showed that the degree of behavioral grouping differed among genotypes. Visualization through the SOM further characterized the clusters of specimens in terms of the activity and spatial variables. The study demonstrated that data-mining techniques based on artificial neural networks could be a useful tool for analyzing complex behaviors induced by changes in genetic information.

11.
This paper presents an approach to the well-known Travelling Salesman Problem (TSP) using Self-Organizing Maps (SOM). The SOM algorithm carries useful topological information about the configuration of its neurons in Cartesian space, which can be used to solve optimization problems. Aspects of initialization, parameter adaptation, and complexity analysis of the proposed SOM-based algorithm are discussed. The results show an average deviation of 3.7% from the optimal tour length for a set of 12 TSP instances.
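
The general idea of solving the TSP with a SOM can be sketched as follows. This is a minimal illustration of the generic ring-SOM heuristic, not the algorithm proposed in the paper: the neuron count, learning rate, neighbourhood radius, decay factors, and the function names `som_tsp` and `tour_length` are all assumptions chosen for the example, and city coordinates are assumed to lie in the unit square.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def tour_length(cities, order):
    """Length of the closed tour that visits the cities in the given order."""
    return sum(
        distance(cities[order[i]], cities[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


def som_tsp(cities, iterations=5000, seed=0):
    """Approximate a TSP tour with a ring of neurons (all parameters assumed)."""
    rng = random.Random(seed)
    n_neurons = 8 * len(cities)
    # Neurons start at random positions in the unit square.
    neurons = [[rng.random(), rng.random()] for _ in range(n_neurons)]
    learning_rate = 0.8
    radius = n_neurons / 8.0

    for _ in range(iterations):
        city = rng.choice(cities)
        # Winner: the neuron currently closest to the sampled city.
        winner = min(range(n_neurons), key=lambda j: distance(neurons[j], city))
        for j in range(n_neurons):
            # Neighbourhood is measured along the ring, not in the plane.
            ring = min(abs(j - winner), n_neurons - abs(j - winner))
            influence = math.exp(-(ring * ring) / (2.0 * radius * radius))
            neurons[j][0] += learning_rate * influence * (city[0] - neurons[j][0])
            neurons[j][1] += learning_rate * influence * (city[1] - neurons[j][1])
        # Shrink the step size and the neighbourhood over time.
        learning_rate *= 0.9997
        radius = max(radius * 0.999, 0.1)

    # Read the tour off the ring: order cities by their nearest neuron's index.
    nearest = [
        min(range(n_neurons), key=lambda j: distance(neurons[j], c)) for c in cities
    ]
    return sorted(range(len(cities)), key=lambda i: nearest[i])
```

Calling `som_tsp(cities)` returns a permutation of city indices, and `tour_length(cities, order)` gives the length of the corresponding closed tour. The result is a heuristic tour with no optimality guarantee.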

12.
13.
Cortical maps of orientation preference in cats, ferrets and monkeys contain numerous half-rotation point singularities. Experimental data have shown that direction preference also has a smooth representation in these maps, with preferences being for the most part orthogonal to the axis of preferred orientation. As a result, the orientation singularities induce an extensive set of linear fractures in the direction map. These fractures run between and connect nearby point orientation singularities. Their existence appears to pose a puzzle for theories that postulate that cortical maps maximize continuity of representation, because the fractures could be avoided if the orientation map contained full-rotation singularities. Here we show that a dimension-reduction model of cortical map formation, which implements principles of continuity and completeness, produces an arrangement of linear direction fractures connecting point orientation singularities which is similar to that observed experimentally. We analyse the behaviour of this model and suggest reasons why the model maps contain half-rotation rather than full-rotation orientation singularities.

14.
Macrofungal communities were investigated in four associations of xerothermic swards: Festucetum pallentis, Origano-Brachypodietum, Adonido-Brachypodietum pinnati and Diantho-Armerietum elongatae in a Jurassic area of the Częstochowa Upland (southern Poland). A total of 47 species were recorded. The self-organising map (SOM), an unsupervised algorithm for artificial neural networks, was used to recognise patterns in the macrofungal communities of diverse xerothermic swards. Only two associations were mycologically similar: Origano-Brachypodietum and Adonido-Brachypodietum pinnati. Species with high and significant IndVal (the species indicator value) for each investigated phytocoenosis are presented. The presence of macrofungal species and the participation of indicator species were connected with habitat factors of plant associations, as documented by the IndVal application. In the least fertile phytocoenoses, macrofungal communities were poor with few indicator species. The more fertile phytocoenoses had richer and more varied communities of macrofungi with higher numbers of indicator species. The ordering methods applied in this study were very effective for analyzing the macrofungal communities existing in plant associations.

15.
Rowland JJ. Bio Systems 2003, 72(1-2): 187-196
The expressive power, powerful search capability, and the explicit nature of the resulting models make evolutionary methods very attractive for supervised learning applications in bioinformatics. However, their characteristics also make them highly susceptible to overtraining or to discovering chance relationships in the data. Identification of appropriate criteria for terminating evolution and for selecting an appropriately validated model is vital. Some approaches that are commonly applied to other modelling methods are not necessarily applicable in a straightforward manner to evolutionary methods. An approach to model selection is presented that is not unduly computationally intensive. To illustrate the issues and the technique two bioinformatic datasets are used, one relating to metabolite determination and the other to disease prediction from gene expression data.

16.
17.
18.
19.
Selection mapping applies the population genetics theory of hitchhiking to the localization of genomic regions containing genes under selection. This approach predicts that neutral loci linked to genes under positive selection will have reduced diversity due to their shared history with a selected locus, and thus, genome scans of diversity levels can be used to identify regions containing selected loci. Most previous approaches to this problem ignore the spatial genomic pattern of diversity expected under selection. The regression-based approach advocated in this paper takes into account the expected pattern of decreasing genetic diversity with increased proximity to a selected locus. Simulated data are used to examine the patterns of diversity under different scenarios, in order to assess the power of a regression-based approach to the identification of regions under selection. Application of this method to both simulated and empirical data demonstrates its potential to detect selection. In contrast to some other methods, the regression approach described in this paper can be applied to any marker type. Results also suggest that this approach may give more precise estimates of the location of the selected locus than alternative methods, although the power is slightly lower in some cases.
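
To make the regression idea concrete, here is a minimal sketch that assumes the simplest possible model: diversity regressed linearly on the distance of each marker from a candidate locus, with the best-fitting candidate reported. The abstract does not specify the paper's actual regression model, so the linear form and the names `ols` and `scan_candidates` are hypothetical choices for illustration only.

```python
def ols(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        # Degenerate case: no variation in x or y, so no fit is possible.
        return 0.0, mean_y, 0.0
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, (sxy * sxy) / (sxx * syy)


def scan_candidates(positions, diversity, candidates):
    """Return (candidate, r_squared) for the best-fitting candidate locus.

    Only fits where diversity rises with distance (positive slope) are kept,
    matching the expectation that diversity is lowest next to a selected locus.
    Returns None if no candidate gives a positive slope.
    """
    best = None
    for candidate in candidates:
        distances = [abs(p - candidate) for p in positions]
        slope, _, r_squared = ols(distances, diversity)
        if slope > 0 and (best is None or r_squared > best[1]):
            best = (candidate, r_squared)
    return best
```

For example, `scan_candidates(marker_positions, diversity_values, candidate_positions)` returns the candidate position whose distance profile best explains the observed diversity under this assumed linear model.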

20.
The prioritisation of potential agents on the basis of likely efficacy is an important step in biological control because it can increase the probability of a successful biocontrol program, and reduce risks and costs. In this introductory paper we define success in biological control, review how agent selection has been approached historically, and outline the approach to agent selection that underpins the structure of this special issue on agent selection. Developing criteria by which to judge the success of a biocontrol agent (or program) provides the basis for agent selection decisions. Criteria will depend on the weed, on the ecological and management context in which that weed occurs, and on the negative impacts that biocontrol is seeking to redress. Predicting which potential agents are most likely to be successful poses enormous scientific challenges. 'Rules of thumb', 'scoring systems' and various conceptual and quantitative modelling approaches have been proposed to aid agent selection. However, most attempts have met with limited success due to the diversity and complexity of the systems in question. This special issue presents a series of papers that deconstruct the question of agent choice with the aim of progressively improving the success rate of biological control. Specifically they ask: (i) what potential agents are available and what should we know about them? (ii) what type, timing and degree of damage is required to achieve success? and (iii) which potential agent will reach the necessary density, at the right time, to exert the required damage in the target environment?
