Similar literature
20 similar articles found.
1.
In this paper, we propose a modification of Kohonen's self-organizing map (SOM) algorithm. When the input signal space is not convex, some reference vectors of SOM can protrude from it. The input signal space must be convex to keep all the reference vectors fixed on it for any updates. Thus, we introduce a projection learning method that fixes the reference vectors onto the input signal space. This version of SOM can be applied to a non-convex input signal space. We applied SOM with projection learning to a direction map observed in the primary visual cortex of area 17 of ferrets, and area 18 of cats. Neurons in those areas responded selectively to the orientation of edges or line segments, and their directions of motion. Some iso-orientation domains were subdivided into selective regions for the opposite direction of motion. The abstract input signal space of the direction map described in the manner proposed by Obermayer and Blasdel [(1993) J Neurosci 13: 4114–4129] is not convex. We successfully used SOM with projection learning to reproduce a direction-orientation joint map. Received: 29 September 2000 / Accepted: 7 March 2001
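The idea of projection learning can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the projection operator, so the example assumes a toy non-convex input space (the unit circle) and projects each reference vector to its nearest point on that circle after every standard Kohonen update. The function names and parameter values are hypothetical.

```python
import numpy as np

def project_to_circle(w):
    """Map each row of w to the nearest point on the unit circle (assumed input space)."""
    norms = np.linalg.norm(w, axis=1, keepdims=True)
    return w / np.maximum(norms, 1e-12)

def train_som_with_projection(n_units=20, n_steps=2000, lr=0.2, sigma=2.0, seed=0):
    rng = np.random.default_rng(seed)
    # 1-D chain of units whose reference vectors start on the circle.
    angles = rng.uniform(0.0, 2.0 * np.pi, n_units)
    w = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    idx = np.arange(n_units)
    for _ in range(n_steps):
        theta = rng.uniform(0.0, 2.0 * np.pi)
        x = np.array([np.cos(theta), np.sin(theta)])      # input drawn from the circle
        winner = np.argmin(np.sum((w - x) ** 2, axis=1))  # best-matching unit
        h = np.exp(-((idx - winner) ** 2) / (2.0 * sigma ** 2))
        w += lr * h[:, None] * (x - w)   # standard Kohonen update (can leave the circle)
        w = project_to_circle(w)         # projection step (assumed form)
    return w
```

Without the projection step, the update is a convex combination of points on the circle and pulls reference vectors into the interior of the disc, which lies outside the input space; the projection puts them back after each step.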

2.
Dissolved organic matter (DOM) is a complex mixture of organic compounds, ubiquitous in marine and freshwater systems. Fluorescence spectroscopy, by means of Excitation-Emission Matrices (EEM), has become an indispensable tool to study DOM sources, transport and fate in aquatic ecosystems. However, the statistical treatment of large and heterogeneous EEM data sets still represents an important challenge for biogeochemists. Recently, Self-Organising Maps (SOM) have been proposed as a tool to explore patterns in large EEM data sets. SOM is a pattern recognition method which clusters and reduces the dimensionality of input EEMs without relying on any assumption about the data structure. In this paper, we show how SOM, coupled with a correlation analysis of the component planes, can be used both to explore patterns among samples and to identify individual fluorescence components. We analysed a large and heterogeneous EEM data set, including samples from a river catchment collected under a range of hydrological conditions, along a 60-km downstream gradient, and under the influence of different degrees of anthropogenic impact. According to our results, chemical industry effluents appeared to have unique and distinctive spectral characteristics. On the other hand, river samples collected under flash flood conditions showed homogeneous EEM shapes. The correlation analysis of the component planes suggested the presence of four fluorescence components, consistent with DOM components previously described in the literature. A remarkable strength of this methodology was that outlier samples appeared naturally integrated in the analysis. We conclude that SOM coupled with a correlation analysis procedure is a promising tool for studying large and heterogeneous EEM data sets.

3.
Clean absorption mode NMR data acquisition is presented, based on mirrored time domain sampling and the widely used time-proportional phase incrementation (TPPI) for quadrature detection. The resulting NMR spectra are devoid of dispersive frequency domain peak components. Those peak components complicate peak identification and shift peak maxima, and thus impede automated spectral analysis. The new approach is also of unique value for obtaining clean absorption mode reduced-dimensionality projection NMR spectra, which can rapidly provide high-dimensional spectral information for high-throughput NMR structure determination.

4.
We introduce an unsupervised competitive learning rule, called the extended Maximum Entropy learning Rule (eMER), for topographic map formation. Unlike Kohonen's Self-Organizing Map (SOM) algorithm, the presence of a neighborhood function is not a prerequisite for achieving topology-preserving mappings, but instead it is intended: (1) to speed up the learning process and (2) to perform nonparametric regression. We show that, when the neighborhood function vanishes, the neural weight density at convergence approaches a linear function of the input density so that the map can be regarded as a nonparametric model of the input density. We apply eMER to density estimation and compare its performance with that of the SOM algorithm and the variable kernel method. Finally, we apply the ‘batch’ version of eMER to nonparametric projection pursuit regression and compare its performance with that of back-propagation learning, projection pursuit learning, constrained topological mapping, and the Heskes and Kappen approach. Received: 12 August 1996 / Accepted in revised form: 9 April 1997

5.
The self-organizing map (SOM) has been used in protein folding prediction with the HP model. The existing work uses a square-like lattice with l = m × n points to represent the optimal compact structure of a sequence of l amino acids. In this paper, a general sequence of l' amino acids is self-organized in a two-dimensional lattice with l (> l') points. The resulting minimum configuration then has a flexible shape, in contrast to the compact structure confined to the lattice. To achieve this extension, a new SOM technique is proposed to deal with the difficulty of the asymmetric input and output spaces. New competition rules are introduced in the training phase, and a local search method is applied to overcome the multi-mapping phenomenon. Several HP benchmark examples with up to 36 amino acids are tested to verify the effectiveness of the proposed approach.

6.
Community-level physiological profiles obtained with Biolog microplates are widely used to assess the functional diversity of bacterial communities. Biolog produces a large amount of data, whose analysis has been the subject of many studies. In most cases, after some transformations, these data were investigated with classical multivariate analyses. Here we provide an alternative to this method, namely the use of an artificial intelligence technique, the Self-Organizing Map (SOM, an unsupervised neural network). We used data from a microcosm study of algae-associated bacterial communities placed in various nutritive conditions. Analyses were carried out on the net absorbances at two incubation times for each substrate and on the chemical guild categorization of the total bacterial activity. Compared to Principal Components Analysis and cluster analysis, SOM appeared to be a valuable tool for community classification and for establishing clear relationships between clusters of bacterial communities and sole-carbon-source utilization. Specifically, SOMs offered a clear two-dimensional projection of a relatively large volume of data and were easier to interpret than plots commonly obtained with multivariate analyses. They would be recommended for characterizing the temporal evolution of the functional diversity of communities.

7.
Memory-based learning schemes are characterized by their memorization of observed events. A memory-based learning scheme either memorizes the collected data directly or reorganizes such information and stores it distributively in a tabular memory. For the tabular type, the system requires a training process to determine the contents of the associative memory. This training process helps filter out zero-mean noise. Since the stored data are associated with pre-assigned input locations, memory management and data retrieval are easier and more efficient. Despite these merits, a drawback of tabular schemes is the difficulty of applying them to high-dimensional problems due to the curse of dimensionality. As the input dimensionality increases, the number of quantized elements in the input space increases at an exponential rate, which causes a huge demand for memory. In this paper, a dynamic tabular structure is proposed to relax this demand. The memorized data are organized as part of a k-d tree. Nodes in the tree, called vertices, correspond to some regularly assigned input points. Memory is allocated only at locations where it is needed. Being able to easily compute the vertex positions helps reduce the search cost in data retrieval. In addition, the learning process is able to expand the tree structure into one covering the problem domain. With memory allocated on demand, memory consumption becomes closely related to task complexity instead of input dimensionality, and is minimized. Algorithms for structure construction and training are presented in this paper. Simulation results demonstrate that the memory can be efficiently utilized. The developed scheme offers a solution for high-dimensional learning problems with a manageable size of effective input domain.

8.
This article presents SOMCD, an improved method for the evaluation of protein secondary structure from circular dichroism spectra, based on Kohonen's self-organizing maps (SOM). Protein circular dichroism (CD) spectra are used to train a SOM, which arranges the spectra on a two-dimensional map. Location in the map reflects the secondary structure composition of a protein. With SOMCD, the prediction of beta-turn has been included. The number of spectra in the training set has been increased, and it now includes 39 protein spectra and 6 reference spectra. Finally, SOM parameters have been chosen to minimize distortion and make the network produce clusters with known properties. Estimation results show improvements compared with the previous version, K2D, which, in addition, estimated only three secondary structure components; the accuracy of the method is more uniform over the different secondary structures.

9.
10.
In this article, we propose a new learning method called "self-enhancement learning." In this method, targets for learning are not given from the outside, but are spontaneously created within a neural network. To realize the method, we consider a neural network with two different states, namely, an enhanced and a relaxed state. The enhanced state is one in which the network responds very selectively to input patterns, while in the relaxed state, the network responds almost equally to input patterns. The gap between the two states can be reduced by minimizing the Kullback-Leibler divergence between the two states with free energy. To demonstrate the effectiveness of this method, we applied self-enhancement learning to self-organizing maps (SOM), in which lateral interactions were added to the enhanced state. We applied the method to the well-known Iris, wine, housing and cancer machine learning database problems. In addition, we applied the method to real-life data, a student survey. Experimental results showed that the U-matrices obtained were similar to those produced by the conventional SOM. Class boundaries were made clearer in the housing and cancer data. For all the data, except for the cancer data, better performance could be obtained in terms of quantitative and topological errors. In addition, we could see that trustworthiness and continuity, which refer to the quality of neighborhood preservation, could be improved by self-enhancement learning. Finally, we used modern dimensionality reduction methods and compared their results with those obtained by self-enhancement learning. The results obtained by self-enhancement learning were not superior to, but comparable with, those obtained by the modern dimensionality reduction methods.

11.
Wang L, Zhou J, Qu A. Biometrics, 2012, 68(2): 353-360
We consider the penalized generalized estimating equations (GEEs) for analyzing longitudinal data with high-dimensional covariates, which often arise in microarray experiments and large-scale health studies. Existing high-dimensional regression procedures often assume independent data and rely on the likelihood function. Construction of a feasible joint likelihood function for high-dimensional longitudinal data is challenging, particularly for correlated discrete outcome data. The penalized GEE procedure only requires specifying the first two marginal moments and a working correlation structure. We establish the asymptotic theory in a high-dimensional framework where the number of covariates p(n) increases as the number of clusters n increases, and p(n) can reach the same order as n. One important feature of the new procedure is that the consistency of model selection holds even if the working correlation structure is misspecified. We evaluate the performance of the proposed method using Monte Carlo simulations and demonstrate its application using a yeast cell-cycle gene expression data set.

12.
Humans and animals are able to learn complex behaviors based on a massive stream of sensory information from different modalities. Early animal studies have identified learning mechanisms that are based on reward and punishment such that animals tend to avoid actions that lead to punishment whereas rewarded actions are reinforced. However, most algorithms for reward-based learning are only applicable if the dimensionality of the state-space is sufficiently small or its structure is sufficiently simple. Therefore, the question arises how the problem of learning on high-dimensional data is solved in the brain. In this article, we propose a biologically plausible generic two-stage learning system that can directly be applied to raw high-dimensional input streams. The system is composed of a hierarchical slow feature analysis (SFA) network for preprocessing and a simple neural network on top that is trained based on rewards. We demonstrate by computer simulations that this generic architecture is able to learn quite demanding reinforcement learning tasks on high-dimensional visual input streams in a time that is comparable to the time needed when an explicit highly informative low-dimensional state-space representation is given instead of the high-dimensional visual input. The learning speed of the proposed architecture in a task similar to the Morris water maze task is comparable to that found in experimental studies with rats. This study thus supports the hypothesis that slowness learning is one important unsupervised learning principle utilized in the brain to form efficient state representations for behavioral learning.

13.
Exploitation of microbial wealth, of which almost 95% or more is still unexplored, is a growing need. The taxonomic placement of a new isolate based on phenotypic characteristics is now being supported by information preserved in the 16S rRNA gene. However, the analysis of 16S rDNA sequences retrieved from metagenomes, using the available bioinformatics tools, is subject to limitations. In this study, the occurrences of nucleotide features in 16S rDNA sequences have been used to ascertain the taxonomic placement of organisms. The tetra- and penta-nucleotide features were extracted from the training data set of 16S rDNA sequences and were subjected to an artificial neural network (ANN) based tool known as the self-organizing map (SOM), which helped in visualization of unsupervised classification. For selection of significant features, principal component analysis (PCA) or curvilinear component analysis (CCA) was applied. The SOM along with these techniques could discriminate the sample sequences with more than 90% accuracy, highlighting the relevance of the features. To ascertain the confidence level in the developed classification approach, the test data set was specifically evaluated for Thiobacillus, with Acidiphilium, Paracoccus and Starkeya, which are taxonomically reassigned. The evaluation proved the excellent generalization capability of the developed tool. The topology of genera in the SOM supported the conventional chemo-biochemical classification reported in Bergey's Manual.
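As a rough illustration of what extracting tetra-nucleotide features can look like, the sketch below counts overlapping k-mers in a sequence and normalizes them into a fixed-length frequency vector suitable as SOM input. The abstract does not describe the exact feature definition or normalization, so the function name `kmer_frequencies` and the choice to skip windows containing non-ACGT symbols are assumptions.

```python
from collections import Counter
from itertools import product

def kmer_frequencies(seq, k=4):
    """Return relative frequencies of all 4**k k-mers over A, C, G, T (lexicographic order)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows with ambiguous symbols (e.g. N) are ignored in the total.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]
```

With k=4 this gives a 256-dimensional vector per sequence, and k=5 gives 1024 dimensions, which is why a feature selection step such as PCA is a natural companion.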

14.
In single-particle analysis, a three-dimensional (3-D) structure of a protein is constructed using electron microscopy (EM). As these images are very noisy in general, the primary process of this 3-D reconstruction is the classification of images according to their Euler angles, the images in each classified group then being averaged to reduce the noise level. In our newly developed classification strategy, we introduce a topology representing network (TRN) method. It is a modification of the growing neural gas (GNG) network. In this system, a network structure is automatically determined in response to the input images through a growing process. After learning without a masking procedure, the GNG creates clear averages of the inputs as unit coordinates in multi-dimensional space, which are then utilized for classification. In the process, connections are automatically created between highly related units and their positions are shifted to where the inputs are distributed in multi-dimensional space. Consequently, several separated groups of connected units are formed. Although the interrelationships of units in this space are not easily understood, we succeeded in solving this problem by converting the unit positions into two-dimensional (2-D) space, and by further optimizing the unit positions with the simulated annealing (SA) method. In the optimized 2-D map, visualization of the connections of units provided rich information about clustering. As demonstrated here, this method is clearly superior to both multi-variate statistical analysis (MSA) and the self-organizing map (SOM) as a classification method, and provides the first reliable classification method which can be used without masking for very noisy images.

15.
16.
Non-linear data structure extraction using simple Hebbian networks
We present a class of neural network algorithms based on simple Hebbian learning which allow the finding of higher order structure in data. The neural networks use negative feedback of activation to self-organise; such networks have previously been shown to be capable of performing principal component analysis (PCA). In this paper, this is extended to exploratory projection pursuit (EPP), which is a statistical method for investigating structure in high-dimensional data sets. As opposed to previous proposals for networks which learn using Hebbian learning, no explicit weight normalisation, decay or weight clipping is required. The results are extended to multiple units and related to both the statistical literature on EPP and the neural network literature on non-linear PCA. Received: 30 May 1994/Accepted in revised form: 18 November 1994
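A minimal sketch of the linear (PCA) case may help show why no explicit normalisation is needed. It assumes the usual form of a negative feedback network: feed activation forward, subtract the fed-back activation from the input, and apply simple Hebbian learning to the residual. The abstract does not give the equations, so this form, the function name and the parameter values are assumptions, and the EPP extension is not shown.

```python
import numpy as np

def train_negative_feedback(data, n_units=1, lr=0.005, seed=0):
    """Hebbian learning on the residual left after negative feedback (assumed form)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=(n_units, data.shape[1]))
    for x in data:
        y = w @ x                   # feedforward activation
        e = x - w.T @ y             # negative feedback: residual of the input
        w += lr * np.outer(y, e)    # simple Hebbian update on the residual
    return w

# Toy data: zero-mean Gaussian with most variance along the first axis.
rng = np.random.default_rng(1)
data = rng.normal(size=(5000, 2)) * np.array([2.0, 0.5])
weights = train_negative_feedback(data)
```

For a single unit the update expands to lr * y * (x - y * w), which has the same form as Oja's rule, so the weight norm is held near one by the feedback term itself rather than by a separate normalisation, decay or clipping step.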

17.
A consensus map of QTLs controlling the root length of maize
Despite their low carbon (C) content, most subsoil horizons contribute more than half of the total soil C stocks, and therefore need to be considered in the global C cycle. Until recently, the properties and dynamics of C in deep soils were largely ignored. The aim of this review is to synthesize literature concerning the sources, composition, and mechanisms of stabilisation and destabilisation of soil organic matter (SOM) stored in subsoil horizons. Organic C input into subsoils occurs in dissolved form (DOC) following preferential flow pathways, as aboveground or root litter and exudates along root channels and/or through bioturbation. The relative importance of these inputs for subsoil C distribution and dynamics still needs to be evaluated. Generally, C in deep soil horizons is characterized by high mean residence times of up to several thousand years. With few exceptions, the carbon-to-nitrogen (C/N) ratio decreases with soil depth, while the stable C and N isotope ratios of SOM increase, indicating that organic matter (OM) in deep soil horizons is highly processed. Several studies suggest that SOM in subsoils is enriched in microbial-derived C compounds and depleted in energy-rich plant material compared to topsoil SOM. However, the chemical composition of SOM in subsoils is soil-type specific and greatly influenced by pedological processes. Interaction with the mineral phase, in particular amorphous iron (Fe) and aluminum (Al) oxides, was reported to be the main stabilisation mechanism in acid and near-neutral soils. In addition, occlusion within soil aggregates has been identified to account for a great proportion of SOM preserved in subsoils. Laboratory studies have shown that the decomposition of subsoil C with high residence times could be stimulated by addition of labile C. Other mechanisms leading to destabilisation of SOM in subsoils include disruption of the physical structure and nutrient supply to soil microorganisms.
One of the most important factors leading to protection of SOM in subsoils may be the spatial separation of SOM, microorganisms and extracellular enzyme activity, possibly related to the heterogeneity of C input. As a result of the different processes, stabilized SOM in subsoils is horizontally stratified. In order to better understand deep SOM dynamics and to include them in soil C models, quantitative information about C fluxes resulting from C input, stabilisation and destabilisation processes at the field scale is necessary.

18.
A new, more efficient variant of a recently developed algorithm for unsupervised fuzzy clustering is introduced. A Weighted Incremental Neural Network (WINN) is introduced and used for this purpose. The new approach is called FC-WINN (Fuzzy Clustering using WINN). The WINN algorithm produces a net of nodes connected by edges, which reflects and preserves the topology of the input data set. Additional weights, which are proportional to the local densities in input space, are associated with the resulting nodes and edges to store useful information about the topological relations in the given input data set. A fuzziness factor, proportional to the connectedness of the net, is introduced in the system. A watershed-like procedure is used to cluster the resulting net. The number of resulting clusters is determined by this procedure. Only two parameters must be chosen by the user for the FC-WINN algorithm, to determine the resolution and the connectedness of the net. The other parameters that must be specified are those required by the underlying incremental neural network, which is a modified version of the Growing Neural Gas (GNG) algorithm. The FC-WINN algorithm is computationally efficient when compared to other approaches for clustering large high-dimensional data sets.

19.
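The self-organizing feature map (SOM) network is an analytical method recently introduced into plant ecology, with strong capabilities for analysing and solving complex and non-linear problems. In this study, SOM classification and ordination were applied to North China larch forests in the Pangquangou Nature Reserve. The results show that SOM divided the 120 quadrats into 7 plant community types, and that the classification has clear ecological meaning; quadrats and species were distributed in a regular pattern on the trained SOM map; each of the 7 community types had its own distribution range and boundaries, revealing the ecological relationships among communities. On this basis, by introducing a method for visualizing environmental-factor gradients on the trained SOM map, the relationships among quadrats, species and environmental factors could be analysed well, revealing that elevation is the most important factor affecting the growth and distribution of North China larch forests in this area. The ecological analysis shows that SOM classification and ordination is an effective gradient analysis method, suitable for characterizing ecological features and exploring the relationships between communities and the environment.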

20.
With a remarkable increase in genomic sequence data of a wide range of species, novel tools are needed for comprehensive analyses of the big sequence data. The self-organizing map (SOM) is a powerful tool for clustering high-dimensional data on one plane. For oligonucleotide compositions handled as high-dimensional data, we have previously modified the conventional SOM for genome informatics: BLSOM. In the present study, we constructed BLSOMs for oligonucleotide compositions in fragment sequences (e.g. 100 kb) from a wide range of vertebrates, including coelacanth, and found that the sequences were clustered primarily according to species without species information. As one of the nearest living relatives of tetrapod ancestors, coelacanth is believed to provide access to the phenotypic and genomic transitions leading to the emergence of tetrapods. The characteristic oligonucleotide composition found for coelacanth was connected with the lowest dinucleotide CG occurrence (i.e. the highest CG suppression) among fishes, which was roughly equivalent to that of tetrapods. This evident CG suppression in coelacanth should reflect molecular evolutionary processes of epigenetic systems including DNA methylation during vertebrate evolution. The sequence of a de novo DNA methylase (Dnmt3a) of coelacanth was found to be more closely related to that of tetrapods than to that of other fishes.
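CG suppression is commonly quantified as an observed/expected ratio of the CG dinucleotide. The abstract does not say which statistic the authors used, so the sketch below is an assumption: it applies the widely used ratio (number of CG × sequence length) / (number of C × number of G), where values well below 1 indicate suppression. The function name is hypothetical.

```python
def cpg_observed_expected(seq):
    """Observed/expected CG ratio: (count(CG) * N) / (count(C) * count(G))."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cg = seq.count("CG")  # "CG" cannot overlap itself, so str.count is exact here
    if c == 0 or g == 0:
        return 0.0        # ratio undefined without both bases; report 0.0 by convention
    return cg * n / (c * g)
```

Applied to each genome fragment (for example the 100 kb windows mentioned above), this gives one scalar per fragment that can be compared across species alongside the full oligonucleotide composition.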
