Similar literature
20 similar documents found
1.
2.
High-dimensional data enlarge the feature space, which increases computational complexity and lowers generalization. Microarray data classification is one such problem. Microarrays contain genetic and biological data that can be used to diagnose diseases, including various types of cancers and tumors. Because the dimensionality of these data is intractable, dimension reduction is necessary. The main goal of this paper is to provide a method for dimension reduction and classification of genetic data sets. The proposed approach has several stages. In the first stage, several feature ranking methods are fused to enhance the robustness and stability of the feature selection process. A wrapper method is combined with the proposed hybrid ranking method to capture the interaction between genes. Classification is then performed with a support vector machine (SVM). Before the data are fed to the SVM classifier, the problem of imbalanced classes in the training phase must be overcome. Experimental results on five microarray databases show that the robustness metric of the feature selection process lies in the interval [0.70, 0.88] and that classification accuracy lies in the range [91%, 96%].
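The fusion of several feature rankings mentioned in this abstract can be illustrated with a mean-rank (Borda-style) aggregation. The abstract does not say which fusion rule the authors use, so the rule itself, the function names `ranks_from_scores` and `fuse_rankings`, and the toy scores are all assumptions made for this sketch.

```python
def ranks_from_scores(scores):
    """Return the rank position of each feature (0 = best); higher score is better."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, feature in enumerate(order):
        ranks[feature] = position
    return ranks


def fuse_rankings(score_lists):
    """Average the per-method ranks and return feature indices, best first."""
    n_features = len(score_lists[0])
    all_ranks = [ranks_from_scores(scores) for scores in score_lists]
    mean_rank = [
        sum(ranks[i] for ranks in all_ranks) / len(all_ranks)
        for i in range(n_features)
    ]
    return sorted(range(n_features), key=lambda i: mean_rank[i])


# Hypothetical scores for four genes from three ranking methods.
scores = [
    [0.9, 0.1, 0.5, 0.3],
    [0.8, 0.2, 0.6, 0.1],
    [0.7, 0.3, 0.9, 0.2],
]
fused = fuse_rankings(scores)
print(fused)
```

Averaging ranks rather than raw scores keeps methods with different score scales comparable, which is one plausible reason to fuse at the rank level.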

3.
Organisms are classified hierarchically. The reason why such a classification is appropriate is that organisms have arisen by a branching process: as Darwin realized, the best image of the evolutionary process is a tree. But the exchange of genetic material, in the sexual process and in more distant transfer events, means that sometimes a net is a more suitable image. Given the characteristics of a set of objects, can we decide how they arose, and whether a hierarchical classification is appropriate? A geometrical approach to this question has recently been suggested by Manfred Eigen and his colleagues.
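One standard way to ask whether pairwise distances fit a tree rather than a net is the four-point condition: for every four objects, the two largest of the three pairwise distance sums must be equal. The sketch below is not necessarily the method of Eigen and colleagues; the function name `is_tree_like` and both toy matrices are assumptions for illustration.

```python
from itertools import combinations


def is_tree_like(dist, tol=1e-9):
    """Check the four-point condition on a symmetric distance matrix."""
    for a, b, c, d in combinations(range(len(dist)), 4):
        sums = sorted([
            dist[a][b] + dist[c][d],
            dist[a][c] + dist[b][d],
            dist[a][d] + dist[b][c],
        ])
        # In a tree metric the two largest sums coincide.
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Hypothetical tree ((A,B),(C,D)): all leaf edges and the inner edge have length 1.
tree = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]
# Hypothetical net: shortest-path distances on a 4-cycle A-B-C-D-A.
cycle = [
    [0, 1, 2, 1],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [1, 2, 1, 0],
]
print(is_tree_like(tree), is_tree_like(cycle))
```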

4.
Our ability to perceive a stable visual world in the presence of continuous movements of the body, head, and eyes has puzzled researchers in the neuroscience field for a long time. We reformulated this problem in the context of hierarchical convolutional neural networks (CNNs), whose architectures have been inspired by the hierarchical signal processing of the mammalian visual system, and examined perceptual stability as an optimization process that identifies image-defining features for accurate image classification in the presence of movements. Movement signals, multiplexed with visual inputs along overlapping convolutional layers, aided classification invariance of shifted images by making the classification faster to learn and more robust relative to input noise. Classification invariance was reflected in activity manifolds associated with image categories emerging in late CNN layers and with network units acquiring movement-associated activity modulations as observed experimentally during saccadic eye movements. Our findings provide a computational framework that unifies a multitude of biological observations on perceptual stability under optimality principles for image classification in artificial neural networks.

5.
This paper introduces a novel approach to gene selection based on a substantial modification of the analytic hierarchy process (AHP). The modified AHP systematically integrates the outcomes of individual filter methods to select the most informative genes for microarray classification. Five individual ranking methods (t-test, entropy, receiver operating characteristic (ROC) curve, Wilcoxon, and signal-to-noise ratio) are employed to rank genes. These ranked genes are then used as inputs for the modified AHP. Additionally, a method that uses the fuzzy standard additive model (FSAM) for cancer classification based on genes selected by AHP is proposed. Traditional FSAM learning is a hybrid process comprising unsupervised structure learning and supervised parameter tuning. A genetic algorithm (GA) is incorporated between unsupervised and supervised training to optimize the number of fuzzy rules. The integration of the GA enables FSAM to deal with the high-dimensional, low-sample nature of microarray data and thus enhances the efficiency of the classification. Experiments are carried out on numerous microarray datasets. Results demonstrate that AHP-based gene selection outperforms the single ranking methods. Furthermore, the AHP-FSAM combination achieves high accuracy in microarray data classification compared with various competing classifiers. The proposed approach is therefore useful for medical practitioners and clinicians as a decision support system that can be implemented in real medical practice.
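As an illustration of one of the filter methods listed above, the sketch below scores genes by a signal-to-noise ratio, taken here as the difference of class means divided by the sum of class standard deviations. The abstract does not give the exact formula, so this common definition, the function name `signal_to_noise`, and the expression values are assumptions.

```python
from statistics import mean, stdev


def signal_to_noise(class_a, class_b):
    """(mean_a - mean_b) / (sd_a + sd_b) for one gene's expression values."""
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


# Hypothetical expression values per gene, split by class label.
genes = {
    "gene_x": ([5.0, 6.0, 7.0], [1.0, 2.0, 3.0]),
    "gene_y": ([4.0, 6.0, 8.0], [3.0, 5.0, 7.0]),
}
snr = {name: signal_to_noise(a, b) for name, (a, b) in genes.items()}
# Rank genes by absolute score, most discriminative first.
ranked = sorted(snr, key=lambda name: abs(snr[name]), reverse=True)
print(ranked)
```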

6.
The most widespread measure of performance, accuracy, suffers from a paradox: predictive models with a given level of accuracy may have greater predictive power than models with higher accuracy. Despite optimizing the classification error rate, high-accuracy models may fail to capture crucial information transfer in the classification task. We present evidence of this behavior by means of a combinatorial analysis in which every possible contingency matrix of 2-, 3- and 4-class classifiers is depicted on the entropy triangle, a more reliable information-theoretic tool for classification assessment. Motivated by this, we develop from first principles a measure of classification performance that takes into consideration the information learned by classifiers. We are then able to obtain the entropy-modulated accuracy (EMA), a pessimistic estimate of the expected accuracy with the influence of the input distribution factored out, and the normalized information transfer factor (NIT), a measure of how efficient the transmission of information from the input to the output set of classes is. The EMA is a more natural measure of classification performance than accuracy when the heuristic to maximize is the transfer of information through the classifier instead of the classification error count. The NIT factor measures the effectiveness of the learning process in classifiers and also makes it harder for them to “cheat” using techniques like specialization, while also promoting the interpretability of results. Their use is demonstrated in a mind-reading task competition that aims at decoding the identity of a video stimulus from magnetoencephalography recordings. We show how the EMA and the NIT factor reject rankings based on accuracy, choosing more meaningful and interpretable classifiers.
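The accuracy paradox described in this abstract can be reproduced with two toy confusion matrices. The sketch computes plain accuracy and the mutual information between true and predicted labels; it does not implement the paper's EMA or NIT, and the function names and both matrices are hypothetical.

```python
import math


def accuracy(cm):
    """Fraction of samples on the diagonal of a confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def mutual_information(cm):
    """Mutual information in bits between true (rows) and predicted (columns) labels."""
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(row[j] for row in cm) for j in range(len(cm[0]))]
    mi = 0.0
    for i, row in enumerate(cm):
        for j, count in enumerate(row):
            if count:
                ratio = count * total / (row_sums[i] * col_sums[j])
                mi += (count / total) * math.log2(ratio)
    return mi


majority = [[90, 0], [10, 0]]   # always predicts class 0
informed = [[70, 20], [0, 10]]  # lower accuracy, but predictions depend on the input

for name, cm in (("majority", majority), ("informed", informed)):
    print(name, accuracy(cm), round(mutual_information(cm), 3))
```

The majority classifier wins on accuracy yet transfers no information from input to output, which is the kind of case an information-based measure is meant to penalize.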

7.
The accurate identification of rice varieties using rapid and nondestructive hyperspectral technology is of practical significance for rice cultivation and agricultural production. This paper proposes a convolutional neural network classification model based on a self-attention mechanism (self-attention-1D-CNN) to improve accuracy in distinguishing between crop species in fields using canopy spectral information. After experimental materials were planted in the research area, portable equipment was used to collect the canopy hyperspectral data for rice during the booting stage. Five preprocessing methods and three extraction methods were used to process the data. A comparison of the classification accuracy of different classification models showed that the self-attention-1D-CNN proposed in this study achieved the best classification with an accuracy of 99.93%. The research demonstrated the feasibility of using hyperspectral technology for the fine classification of rice varieties, and the feasibility of using the CNN model as a potential classification method for near-ground crop monitoring and classification.

8.
9.
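Ascomycetes have complex asexual (anamorphic) and sexual (teleomorphic) states, and knowledge of their phylogeny and relationships is limited, so mycologists still disagree about ascomycete classification. The mating-type (MAT) genes of ascomycetes are evolutionarily conserved, and the proteins they encode regulate sexual reproduction. Sclerotinia sclerotiorum (Lib.) de Bary belongs to the phylum Ascomycota, class Discomycete, and is a typical filamentous homothallic fungus. The mating-type genes MAT1-1 and MAT1-2 that control its sexual reproduction are tightly linked, and the fungus does not show the complication of separate sexual and asexual states. Based on the cloned mating-type gene MAT1-1 of S. sclerotiorum, this paper therefore used the PAUP* software to carry out a phylogenetic analysis of 82 ascomycete species that carry an Alpha-box mating-type gene. Phylogenetic analyses at the nucleotide and amino-acid levels, combined with a comparison against the Ainsworth (1973) classification system and the more recent Deep Hyphae (2006) classification system, showed that the resulting phylogenetic trees are largely consistent with the evolutionary relationships expressed by traditional classification, and that the function of MAT1-1 in S. sclerotiorum has been relatively conserved during evolution. These results should help the cloning, systematic classification, and evolutionary study of mating-type genes in other ascomycetes, and they are also important for understanding the relationships of S. sclerotiorum and for disease forecasting and control.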

10.
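Research progress on wetland landscape ecological classification   Total citations: 4; self-citations: 1; citations by others: 4
Cao Yu, Mo Lijiang, Li Yan, Zhang Wenmei. Chinese Journal of Applied Ecology (《应用生态学报》), 2009, 20(12): 3084-3092
Wetland landscape ecological classification is the premise and foundation of wetland landscape ecology research, and it directly affects the precision and validity of wetland research results. Drawing on the history, current status, and latest progress of research in China and abroad on classification theory, classification indices, and classification methods, this paper systematically introduces and evaluates wetland classification systems such as NWI, Ramsar, and HGM. It concludes that hybrid classification methods, which build on the HGM approach and jointly consider wetland spatial structure, ecological function, ecological processes, topography, soil, vegetation, hydrology, and the intensity of human disturbance, are the main direction of future development in this field. Integrating 3S technology, quantitative mathematics, landscape models, knowledge engineering, artificial intelligence, neural networks, and other methods to raise the automation and accuracy of classification will be both the focus and the main difficulty of future research on wetland landscape ecological classification.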

11.
The functional basis of a primary succession resolved by CSR classification   Total citations: 3; self-citations: 0; citations by others: 3
CSR classification aims to apply CSR theory to large numbers of plants in situ, thereby allowing the investigation of communities within a functional context. However, it has only ever been applied to British vegetation, during the development of the technique, and has not yet been used to investigate specific vegetation processes. Here, a vegetation primary succession on a glacier foreland (Rutor glacier, Aosta, Italy) was used as a 'test bed' for the hypothesis that CSR classification can distinguish functional shifts during this vegetation process. Morpho-functional traits were used to calculate CSR coordinates for 45 species throughout the glacier foreland. General functional similarities between species were verified using principal components analysis (PCA). CSR classification demonstrated a functional shift from broadly ruderal pioneers towards stress-tolerance in late succession. PCA 1 correlated with S and R strategies, confirming this gradient. Till deposited at the retreating glacier terminus provides a substrate that can support faster-growing species (with high foliar N contents), but is only tenable to those that can avoid physical disturbance via rapid phenological development (i.e. ruderals). Stress-tolerance and lower N contents in late succession suggest selection for efficient nutrient use. CSR classification demonstrated that competitive traits were ubiquitous but of much lesser importance than stress-tolerance or ruderalism (also correlating with PCA 2 and 3). The detailed visualization provided by CSR classification, combined with its mechanistic explanation of community change, demonstrates the promise of this methodology as a quantitative tool for comparative community ecology.

12.
N. S. Mitchell  R. L. Cruess 《CMAJ》1977,117(7):763-765
It is suggested that the former division of degenerative arthritis into idiopathic types and those secondary to some disease process is no longer valid. Recent studies have indicated that abnormal concentrations of force on cartilage lead to the development of this disease. A classification is presented that is based on the assumption that the process is initiated by abnormal concentrations of force on normal cartilage matrix, normal concentrations of force on abnormal cartilage matrix, or normal concentrations of force on normal cartilage matrix that is supported by bone of abnormal consistency.

13.
Ecological regions, or ecoregions, derive from the ecological classification of land and represent broad, discrete, ecologically homogeneous areas within which natural communities and species interact with the physical elements of the environment. The aim of this paper is to define the ecoregions of Italy, southern Europe, based on a robust methodological process for classification and mapping. The ecoregions of Italy comprise 2 Divisions, 7 Provinces, 11 Sections and 33 Subsections and constitute the first comprehensive ecological classification of the country that integrates accurate and updated cartography and knowledge on climate, vegetation, land units and biogeography. This classification is suitable for adoption as a framework for ecological modelling, biodiversity conservation policies and sustainable territorial planning at the national and subnational level.

14.
In single-particle analysis, a three-dimensional (3-D) structure of a protein is constructed using electron microscopy (EM). As these images are generally very noisy, the primary step of this 3-D reconstruction is the classification of images according to their Euler angles; the images in each classified group are then averaged to reduce the noise level. In our newly developed classification strategy, we introduce a topology representing network (TRN) method, a modification of the growing neural gas (GNG) network. In this system, a network structure is automatically determined in response to the input images through a growing process. After learning without a masking procedure, the GNG creates clear averages of the inputs as unit coordinates in multi-dimensional space, which are then utilized for classification. In the process, connections are automatically created between highly related units, and their positions are shifted to where the inputs are distributed in multi-dimensional space. Consequently, several separated groups of connected units are formed. Although the interrelationships of units in this space are not easily understood, we succeeded in solving this problem by converting the unit positions into two-dimensional (2-D) space and by further optimizing the unit positions with the simulated annealing (SA) method. In the optimized 2-D map, visualization of the connections between units provided rich information about clustering. As demonstrated here, this method is clearly superior to both multi-variate statistical analysis (MSA) and the self-organizing map (SOM) as a classification method and provides the first reliable classification method that can be used without masking for very noisy images.

15.
Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on community detection algorithms to extract the latent social dimensions, perform poorly when community detection fails. In this paper, we propose a novel behavior-based social dimension extraction method to improve classification performance in multi-label heterogeneous networks. In our method, nodes’ behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes’ connection behaviors with different communities can be extracted accurately and applied as latent social dimensions for classification. Experiments on various public datasets show that the proposed method obtains satisfactory classification results in comparison with other state-of-the-art methods while using fewer social dimensions.

16.
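Cordyceps javanica is a broad-spectrum entomopathogenic fungus with biocontrol potential against the red imported fire ant Solenopsis invicta, a social insect of the order Hymenoptera. To explore the key role that fungus-carrying cadavers in ant-nest refuse piles play in the infection cycle, and to investigate the molecular pathogenic mechanism by which Cordyceps javanica responds to its host under saprophytic conditions, this study induced the fungus by adding freeze-dried insect cadaver powder to the culture medium and carried out transcriptome sequencing of the mycelium-spore mixture before and after induction. Compared with pure culture, induction with cadaver powder revealed 1,912 new genes, of which 379 received a functional annotation. There were 242 significantly differentially expressed genes, 111 up-regulated and 131 down-regulated. GO enrichment analysis indicated that genes involved in metabolic and cellular processes (biological process), cellular anatomical entity (cellular component), and catalytic activity and binding (molecular function) may play important roles in the response of C. javanica to its host. KEGG classification and enrichment analysis indicated that the main metabolic pathways of C. javanica were tryptophan metabolism; valine, leucine and isoleucine degradation; glycolysis/gluconeogenesis; and glyoxylate and dicarboxylate metabolism in the metabolism category; base excision repair and protein processing in the endoplasmic reticulum in the genetic information processing category; ABC transporters and the mitogen-activated protein kinase signaling pathway in the environmental information processing category; and autophagy and the peroxisome pathway in the cellular processes category. Eight randomly selected differentially expressed genes were validated by real-time quantitative PCR, and their expression patterns were largely consistent with the transcriptome results. These results indicate that genes related to amino acid metabolism and energy metabolism are active while C. javanica adapts to a saprophytic environment, possibly because the fungus must maintain its infectivity during vertical and horizontal spread and must supply itself with nutrients that it converts into energy, intermediates, and other substances.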

17.
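Stepwise clustering and its application   Total citations: 10; self-citations: 3; citations by others: 10
This paper introduces a non-hierarchical classification method, stepwise clustering, and applies it to the numerical classification of Elaeagnus mollis (翅果油树) shrubland. The results show the following. Stepwise clustering reaches an optimal classification by taking as its criterion the minimum within-group sum of squared deviations and the maximum between-group sum of squared deviations of the quadrat groups, so that quadrats within a group are as homogeneous as possible and the groups are as heterogeneous as possible; the resulting classification agrees well with field conditions. In addition, stepwise clustering only needs to compute the distance from each quadrat to the centroid of its quadrat group, which shortens computation time, saves computer memory, and improves efficiency. Compared with fuzzy c-means clustering and TWINSPAN, the results of stepwise clustering resemble those of fuzzy c-means clustering, that is, the quadrat groups have high internal homogeneity. When the classification is not required to show a clear hierarchical relationship, stepwise clustering gives better results than TWINSPAN.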
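The criterion described in this abstract (small within-group sum of squares, using only distances from each quadrat to group centroids) can be sketched as an iterative relocation loop in the style of k-means. The abstract does not give the exact update rule, so the loop, the function names, and the toy quadrat coordinates below are assumptions.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(p[d] for p in points) / len(points) for d in range(len(points[0])))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def within_ss(points, labels, k):
    """Sum of squared distances from each point to its own group centroid."""
    total = 0.0
    for g in range(k):
        members = [p for p, lab in zip(points, labels) if lab == g]
        if members:
            c = centroid(members)
            total += sum(sq_dist(p, c) for p in members)
    return total


def relocate(points, labels, k, max_iter=100):
    """Reassign each point to the nearest group centroid until nothing changes."""
    labels = list(labels)
    for _ in range(max_iter):
        cents = {}
        for g in range(k):
            members = [p for p, lab in zip(points, labels) if lab == g]
            if members:
                cents[g] = centroid(members)
        new_labels = [min(cents, key=lambda g: sq_dist(p, cents[g])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Hypothetical quadrats in a two-dimensional ordination space, badly grouped at first.
quadrats = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
start = [0, 1, 0, 1]
final = relocate(quadrats, start, k=2)
print(final, within_ss(quadrats, start, 2), within_ss(quadrats, final, 2))
```

Because only point-to-centroid distances are computed, no full quadrat-by-quadrat distance matrix has to be stored, which matches the memory saving the abstract mentions.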

18.
Integrating gene regulatory networks (GRNs) into the classification process of DNA microarrays is an important issue in bioinformatics, both because this information is of genuine biological interest and because it helps in the interpretation of the final classifier. We present a method called graph-constrained discriminant analysis (gCDA), which aims to integrate the information contained in one or several GRNs into a classification procedure. We show that when the integrated graph includes erroneous information, gCDA's performance is only slightly worse, which indicates robustness to misspecifications in the given GRNs. The gCDA framework also allows the classification process to take into account as many a priori graphs as there are classes in the dataset. The gCDA procedure was applied to simulated data and to three publicly available microarray datasets, where it shows very interesting performance compared with state-of-the-art classification methods. The software package gcda, along with the real datasets used in this study, is available online: http://biodev.cea.fr/gcda/.

19.
Legume systematists have been making great progress in understanding evolutionary relationships within the Leguminosae (Fabaceae), the third largest family of flowering plants. As the phylogenetic picture has become clearer, so too has the need for a revised classification of the family. The organization of the family into three subfamilies and 42 tribes is outdated and evolutionarily misleading. The three traditionally recognized subfamilies, Caesalpinioideae, Mimosoideae, and Papilionoideae, do not adequately represent relationships within the family. The occasion of the Sixth International Legume Conference in Johannesburg, South Africa in January 2013, with its theme “Towards a new classification system for legumes,” provided the impetus to move forward with developing a new classification. A draft classification, based on current phylogenetic results and a set of principles and guidelines, was prepared in advance of the conference as the basis for discussion. The principles, guidelines, and draft classification were presented and debated at the conference. The objectives of the discussion were to develop consensus on the principles that should guide the development of the classification, to discuss the draft classification's strengths and weaknesses and make proposals for its revision, and to identify and prioritize phylogenetic deficiencies that must be resolved before the classification could be published. This paper describes the collaborative process by which a large group of legume systematists, publishing under the name Legume Phylogeny Working Group, is developing a new phylogenetic classification system for the Leguminosae. The goals of this paper are to inform the broader legume community, and others, of the need for a revised classification, and to spell out clearly what the alternatives and challenges are for a new classification system for the family.

20.
The higher levels of the classification of transposable elements (TEs), from Classes to Superfamilies or Families, are regularly updated, but the lower levels (below the Family) have received little investigation. This applies in particular to Families that include a large number of copies. In this article we propose an automatic classification of DNA sequences. The procedure is based on an aggregation process using a pairwise matrix of distances, which allows us to define several groups, each characterized by a sphere with a central sequence and a radius. The method was tested on the mariner Family, because it is probably one of the most extensively studied Families. Several Subfamilies had already been defined from phylogenetic analyses based on multiple alignments of complete or partial amino-acid sequences of the transposase. The classification obtained here from the DNA sequences of 935 items matches the phylogenies of the transposase. The rate of error from a posteriori re-assignment is relatively low.
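The idea of groups defined by a central sequence and a radius over a pairwise distance matrix can be sketched with a greedy covering procedure. The abstract does not specify how centers or radii are chosen, so the fixed radius, the "most neighbors first" rule, the function name `sphere_groups`, and the toy matrix are all assumptions.

```python
def sphere_groups(dist, radius):
    """Greedily pick centers that cover the most unassigned items within `radius`."""
    unassigned = set(range(len(dist)))
    groups = []
    while unassigned:
        # Iterate in index order so ties are broken by the lowest index.
        center = max(
            sorted(unassigned),
            key=lambda i: sum(1 for j in unassigned if dist[i][j] <= radius),
        )
        members = sorted(j for j in unassigned if dist[center][j] <= radius)
        groups.append((center, members))
        unassigned -= set(members)
    return groups


# Hypothetical pairwise distances between five sequences.
dist = [
    [0, 1, 1, 9, 9],
    [1, 0, 2, 9, 9],
    [1, 2, 0, 9, 9],
    [9, 9, 9, 0, 1],
    [9, 9, 9, 1, 0],
]
groups = sphere_groups(dist, radius=1)
print(groups)
```

Each center is always within distance 0 of itself, so every pass assigns at least one item and the loop terminates.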
