Similar literature
1.
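Application of principal component analysis of the principal-component residual image set to protein mass spectrometry data   Total citations: 1, self-citations: 1, citations by others: 0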
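Cancer protein mass spectrometry data contain a large number of unknown internal structures and variables. To address these characteristics, and building on a review of applications of principal component analysis of the principal-component residual image set (second-order principal component analysis), a new method is proposed: a feature subset is selected with a t-test, features are extracted with second-order principal component analysis, and classification is performed with linear discriminant analysis. Classification experiments on typical cancer protein mass spectrometry data show that the method achieves a high recognition rate while requiring only a small feature subset, which improves both accuracy and classification speed.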
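The feature-subset step described above can be sketched as follows. This is a minimal illustration, assuming a two-class problem and Welch's form of the t statistic; the abstract only says "t-test", so the exact variant, the function names, and the toy intensities are all assumptions rather than the authors' procedure.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t statistic for two lists of numbers."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def select_features(samples, labels, k):
    """Return indices of the k features with the largest |t| between two classes."""
    classes = sorted(set(labels))
    assert len(classes) == 2, "this sketch handles two classes only"
    scores = []
    for j in range(len(samples[0])):
        a = [row[j] for row, y in zip(samples, labels) if y == classes[0]]
        b = [row[j] for row, y in zip(samples, labels) if y == classes[1]]
        scores.append((abs(welch_t(a, b)), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]


# Hypothetical intensities: only feature 1 separates the two classes.
spectra = [
    [1.0, 5.1, 0.3],
    [1.2, 4.9, 0.1],
    [0.9, 5.3, 0.2],
    [1.1, 1.0, 0.3],
    [1.0, 1.2, 0.2],
    [1.2, 0.8, 0.1],
]
labels = ["cancer", "cancer", "cancer", "control", "control", "control"]
selected = select_features(spectra, labels, k=1)
```

The selected columns would then be passed to the feature-extraction and discriminant steps, which are not shown here.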

2.
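This article studies the classification of gastric cancer subtypes based on microarray gene expression data. Because such data have few samples, high dimensionality, and heavy noise, dimensionality reduction is the key to successful classification. The authors applied two dimensionality reduction methods, principal component analysis (PCA) and partial least squares (PLS), to gastric cancer subtype classification, using support vector machines (SVM) and k-nearest neighbors (KNN) as classifiers on two gastric cancer data sets. Classification performance was slightly better than that of traditional medical diagnosis, with a maximum accuracy of 100%. The results show that PCA and PLS can effectively extract the features that carry classification information and can greatly reduce the dimensionality of gene expression data while maintaining high classification accuracy.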

3.
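Feature selection techniques are widely used in bioinformatics. Here partial least squares (PLS) is applied to feature selection by repeatedly extracting principal components with PLS and successively selecting the genes that have the largest weights in those components. The method was used to select feature genes from tumor gene expression profile data, and the selected genes were then used for classification; with 8 feature genes, classification accuracy reached 92.5%.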
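A minimal sketch of weight-based gene ranking, assuming a single numeric response and only the first PLS component, whose weight vector is proportional to X^T y after centering. The abstract describes repeating the extraction over several components; that iteration is not reproduced here, and the function names and expression values are hypothetical.

```python
import math


def center(values):
    """Subtract the mean from a list of numbers."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def first_pls_weights(X, y):
    """Unit-norm weights of the first PLS component: w proportional to X^T y (centered)."""
    yc = center(y)
    columns = [center([row[j] for row in X]) for j in range(len(X[0]))]
    w = [sum(x * t for x, t in zip(col, yc)) for col in columns]
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w]


def top_genes(X, y, k):
    """Indices of the k genes with the largest absolute weight."""
    w = first_pls_weights(X, y)
    order = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
    return order[:k]


# Hypothetical expression matrix: 4 samples x 3 genes; gene 2 tracks the class label.
X = [
    [2.0, 1.0, 0.1],
    [2.1, 1.1, 0.2],
    [2.0, 1.0, 3.0],
    [2.1, 1.1, 3.1],
]
y = [0.0, 0.0, 1.0, 1.0]
weights = first_pls_weights(X, y)
chosen = top_genes(X, y, k=1)
```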

4.
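Multivariate statistical component analysis was used to carry out correlation and principal component analyses of six main nutritional indices of protein nutritional value in the mushroom 大杯香菇 after irradiation with different doses of 60Co γ-rays. The results show that most of the nutritional indices were highly significantly or significantly correlated with one another. Principal component analysis showed that the six original indices could be combined into 2 principal components with a cumulative contribution rate of 96.20%. Based on how well the principal components summarize the original traits, nutritional value was analyzed: the 2 principal components can be interpreted, respectively, as a factor reflecting biological value and the essential amino acid index, and a factor reflecting the nutritional index. Principal component equations were established, and principal component scores and composite scores were calculated for each treatment group. The 1.25 kGy treatment group ranked first, so 1.25 kGy is considered a relatively suitable irradiation dose.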
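The cumulative contribution rate mentioned above is the running share of total variance explained by the leading components. A minimal sketch, assuming the eigenvalues of the correlation matrix are already known; the eigenvalues and the 0.90 threshold below are illustrative and are not taken from the paper.

```python
def contribution_rates(eigenvalues):
    """Share of total variance carried by each principal component."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


def components_needed(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative rate reaches threshold."""
    cumulative = 0.0
    for count, rate in enumerate(contribution_rates(eigenvalues), start=1):
        cumulative += rate
        if cumulative >= threshold:
            return count, cumulative
    return len(eigenvalues), cumulative


# Hypothetical eigenvalues for six standardized indices (they sum to 6).
eigenvalues = [4.2, 1.5, 0.15, 0.08, 0.05, 0.02]
count, cumulative = components_needed(eigenvalues, threshold=0.90)
```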

5.
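To explore the feasibility of artificial neural networks (ANN) for insect classification, this paper proposes improving the ANN by combining principal component analysis with mathematical modeling, and validates the approach on six moth species of the family Noctuidae (Lepidoptera). First, the feature extraction software Bugshape1.0 was used to obtain 13 mathematical morphological features from 180 right forewing samples of the six species. Principal component analysis was then used to recombine the wing morphological feature variables into new composite variables, and a BP neural network classifier was built on the principal components. The principal component analysis showed that the first 5 principal components had a cumulative contribution rate of 85.52% and essentially captured the information carried by all the feature variables. On this basis, a three-layer BP neural network classifier was built with 5 input-layer nodes, 12 hidden-layer nodes, and 1 output-layer node. The classifier was trained and simulated on 120 sets of feature data (20 samples per species) and validated on the remaining 60 sets; the correlation coefficient between the simulated output and the target values was R = 0.997, and classification accuracy reached 93.33%. Compared with a classifier built with a BP neural network alone, without principal component analysis, the PCA-based BP neural network classifier showed better performance and more accurate classification. The results indicate that the proposed method classifies and discriminates well and provides a feasible approach to identifying moth species.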

6.
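R-type analysis and principal component analysis were carried out on 84 morphological, biological, and ecological characters of all known taxa of the genus Ottelia (海菜花属) in China. The R-type analysis showed that the characters studied fall clearly into several groups of highly correlated characters, which either indicate that the genus has evolved toward a secondarily terrestrial habit and cross-pollination, or have important taxonomic value. The principal component analysis showed that the first 16 principal components alone retain almost all of the information in the 84 characters, which suggests that the characters selected for the taxonomic study of Chinese Ottelia are well chosen. The first 3 principal components retain 76.56% of the total information, indicating that the relative positions of all known Chinese Ottelia taxa can be represented well in three-dimensional space.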

7.
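A classification model for natural grassland based on projection pursuit   Total citations: 14, self-citations: 0, citations by others: 14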
Jin Juliang (金菊良), Zhang Libing (张礼兵), Pan Jinfeng (潘金锋). Acta Ecologica Sinica (《生态学报》), 2003, 23(10): 2184-2188
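A classification model for natural grassland based on projection pursuit (GQC-RAGAPP) is proposed. The model combines the multidimensional classification indices of each natural grassland into a one-dimensional projection value; the larger the projection value, the higher the overall environmental quality of the grassland, so the grassland sample set can be classified sensibly according to the projection values. It is suggested that GQC-RAGAPP be built with a real-coded accelerating genetic algorithm, which simplifies the implementation of projection pursuit and overcomes the complicated computation and difficult programming of current projection pursuit techniques. The results of a worked example show that the GQC-RAGAPP model, driven directly by sample data, is simple and feasible for natural grassland classification, with good applicability and operability, and that it can be applied to the classification and evaluation of various nonlinear, non-normal, high-dimensional data in research on regional sustainable development.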
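The idea of collapsing several indices into one projection value can be sketched as below. This is only an illustration under stated assumptions: the projection index (spread of the projections multiplied by a local density term) is a commonly used form and is not confirmed by the abstract, plain random search stands in for the real-coded accelerating genetic algorithm, and the grassland data are hypothetical.

```python
import math
import random


def normalize(data):
    """Min-max scale each indicator (column) to [0, 1]."""
    cols = list(zip(*data))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) for v, l, h in zip(row, lo, hi)] for row in data]


def project(data, a):
    """One-dimensional projection value of every sample along direction a."""
    return [sum(x * w for x, w in zip(row, a)) for row in data]


def projection_index(z, radius_factor=0.1):
    """Spread of the projections times their local density (assumed index form)."""
    n = len(z)
    m = sum(z) / n
    s = math.sqrt(sum((v - m) ** 2 for v in z) / (n - 1))
    radius = radius_factor * s
    density = sum(
        radius - abs(zi - zj) for zi in z for zj in z if abs(zi - zj) < radius
    )
    return s * density


def unit(a):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in a))
    return [v / norm for v in a]


def random_search(data, trials=2000, seed=0):
    """Random search over non-negative unit directions (stand-in for the GA)."""
    rng = random.Random(seed)
    p = len(data[0])
    best_a = unit([1.0] * p)
    best_q = projection_index(project(data, best_a))
    for _ in range(trials):
        a = unit([rng.random() for _ in range(p)])
        q = projection_index(project(data, a))
        if q > best_q:
            best_a, best_q = a, q
    return best_a, best_q


# Hypothetical grassland samples: rows are sites, columns are quality indicators.
data = [
    [0.9, 30.0, 5.0],
    [0.7, 25.0, 4.0],
    [0.5, 20.0, 3.5],
    [0.3, 12.0, 2.0],
    [0.1, 5.0, 1.0],
]
scaled = normalize(data)
best_a, best_q = random_search(scaled)
z = project(scaled, best_a)
ranking = sorted(range(len(z)), key=lambda i: z[i], reverse=True)
```

Sites would then be grouped by cutting the sorted projection values into classes; where to place the cuts is a modeling decision the abstract does not specify.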

8.
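This study analyzed the differences in chemical composition among Meconopsis (绿绒蒿) medicinal materials of different botanical origins and the basis for classifying them. Ultra-performance liquid chromatography-quadrupole-time-of-flight mass spectrometry (UPLC-Q-TOF-MS) was used to analyze 49 batches of Meconopsis medicinal material from Meconopsis horridula (多刺绿绒蒿), M. racemosa (总状绿绒蒿), M. quintuplinervia (五脉绿绒蒿), M. integrifolia (全缘绿绒蒿), and M. punicea (红花绿绒蒿), with an ESI source in positive and negative ion scanning modes. The data were imported into PeakView 1.2 software, qualitative analysis was performed with functions such as Formula Finder, Mass Calculators, and XIC manager together with the fragmentation patterns of the MS/MS fragments, and the qualitative results were compiled into a screening table of known components. The data were then loaded into SIMCA-P 14.1 software for visualization, and principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) models were built. In total, 75 chemical components were detected and analyzed. The PCA and PLS-DA results show that, in terms of the types of chemical components present, M. horridula and M. racemosa are essentially the same and cluster well, whereas the other 3 species differ considerably. Drawing on published reports, the study summarizes the chemical components of plants of the genus Meconopsis and, using mass spectral fragmentation patterns, infers and compares the chemical components in each sample, laying a foundation for the classification and identification of Meconopsis varieties.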

9.
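The most fundamental research in remote sensing-based dynamic monitoring of forest vegetation biomass is the study of correlations between biomass and remote sensing data and their derived data, terrain data, and meteorological data. To this end, taking the tropical forest vegetation of Xishuangbanna in Yunnan Province, China, as an example, the correlations between the biomass of young, middle-aged, near-mature, and mature and over-mature forests and the corresponding LANDSAT TM data and their derived data, meteorological data, and terrain data were analyzed. First, using data from the permanent forestry sample plots of the continuous forest inventory, the forest vegetation biomass of each plot was calculated with organ-level biomass estimation models for each tree species group, and a plot GIS database was built from the plot coordinates. Then, the remote sensing images were geometrically corrected against topographic maps, and derived data were generated by applying a principal component transformation, a tasseled cap transformation, and vegetation index calculations to the images. Next, the raster plot data, the remote sensing data (such as LANDSAT TM data) and their derived data (such as the various vegetation indices, principal component data, and the brightness, greenness, and wetness of the tasseled cap transformation), the raster terrain data (such as DEM and aspect), and the raster meteorological data (including annual mean temperature, accumulated temperature above 0 ℃, annual mean rainfall, and moisture index) were unified to the same coordinate system and projection, and all data were interpolated to a 30 m grid. Raster spatial overlay analysis of the plot data with the remote sensing data and their derived data, terrain data, and meteorological data then yielded, for each plot, its plot data, remote sensing and derived data, terrain data, and meteorological data. After that, all data were stratified by the age group of the dominant tree species of each plot into sample data for young, middle-aged, near-mature, and mature and over-mature forests. Finally, correlation analysis was performed for each age group between plot biomass and the corresponding remote sensing and derived data, meteorological data, and terrain data. The study shows that, among all factors, the biomass of young forests was significantly correlated at the 0.05 level with the brightness values of LANDSAT bands TM1 and TM6, with a correlation coefficient of -0.33 in both cases; the biomass of middle-aged forests was significantly correlated at the 0.05 level with rainfall, with a correlation coefficient of 0.33; the biomass of near-mature forests was significantly correlated at the 0.05 level with the LANDSAT TM derived index VI3, LANDSAT TM4, and the tasseled cap brightness, with correlation coefficients of 0.50, -0.45, and -0.45, respectively; and the biomass of mature and over-mature forests was significantly correlated at the 0.05 level with the second principal component (PC2) of the principal component transformation, with a correlation coefficient of -0.46. At the 0.05 level, the highest correlation coefficient was that between near-mature forest biomass and VI3 (0.50), followed by that between mature and over-mature forest biomass and PC2 (-0.46).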
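The per-age-group correlation step can be illustrated with a Pearson coefficient and the associated t statistic, which would be compared with a critical value for n - 2 degrees of freedom to judge significance at the 0.05 level. This is a sketch only; the plot values below are hypothetical and the abstract does not state which test was used.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for the null hypothesis of zero correlation (n - 2 d.f.)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical plots of one age group: biomass and a band brightness value.
biomass = [120.0, 150.0, 90.0, 200.0, 170.0, 110.0]
band_value = [62.0, 58.0, 66.0, 50.0, 55.0, 63.0]
r = pearson_r(biomass, band_value)
t = t_statistic(r, len(biomass))
```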

10.
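This paper uses principal component analysis to analyze quantitatively the taxonomic characters of the eggs of Noctuidae, in order to explain the biological meaning of each principal component and the importance of each variable for classification.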

11.
Li Chao (李超), Li Juan (李娟), Zhang Mingli (张明理). Acta Botanica Boreali-Occidentalia Sinica (《西北植物学报》), 2013, 33(11): 2339-2345
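Thirty qualitative characters and 15 quantitative characters of Epimedium plants were observed, and cluster analysis and principal component analysis were used to study the taxonomic relationships among infrageneric groups of Epimedium. The results show: (1) Cluster analysis divided the Chinese species of Epimedium into a large-flowered group and a small-flowered group, supporting Stearn's treatment of Sect. Macroceras, Sect. Polyphyllon, and Subg. Rhizophyllum, although the taxonomic status of Sect. Epimedium requires further study. (2) In the principal component analysis, the cumulative contribution rate of the characters was not very high, with the first 3 principal components accounting for 51.86%; this may be related to the diversification and complexity of character variation during the evolution of the genus, but the principal component analysis still shows the Chinese species divided into a large-flowered group and a small-flowered group. The study concludes that the characters reflected by the principal components, such as the ratio of petal length to inner sepal length, whether the petals are spurred, and the number of sepal whorls, are of important value for the classification of Epimedium.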

12.
13.
We analyzed variability of morphological characters and genetic polymorphism of inter-simple sequence repeat (ISSR) markers in nine natural populations of three Lotus species from Eastern Europe, aiming to provide insights into the nature of the species L. ucrainicus. Nuclear ribosomal internal transcribed spacer (nrITS) was used as an additional molecular marker for a small subset of accessions. Analysis of variance, and principal coordinate and principal component analyses were applied for morphological data study. Cluster analysis [unweighted pair-group method with arithmetic mean (UPGMA)], principal coordinate analysis, and analysis of population genetic structure were used for ISSR pattern study. Morphological and genetic (ISSR, nrITS) evidence suggested hybrid origin of L. × ucrainicus as a result of hybridization between tetraploid species L. corniculatus and diploid species L. stepposus. We conclude that L. × ucrainicus may represent a case of hybrid speciation in statu nascendi, occurring before our very eyes.

14.
We applied an integrative approach to re-evaluate the taxonomy of the conchologically highly diverse land snail genus Rossmaessleria from the Rif Mountains in Morocco and from Gibraltar, which has been classified into 12 nominal species so far. An analysis of cox1 and 16S rDNA sequences using the General Mixed Yule-coalescent approach with a single or multiple thresholds, its Bayesian implementation as well as the Automatic Barcode Gap Discovery method indicated that all Rossmaessleria populations can be classified into a single species, R. scherzeri (Zelebor, 1867). This result is confirmed by the lack of diagnostic differences in the genitalia as shown in a principal component analysis of the genital measurements. The variation of shell characters also does not allow an unambiguous subdivision of the complex. However, the populations of a mountain or a mountain ridge share characteristic combinations of shell characters so that they can be classified as geographic subspecies. The delimitation of the subspecies and their distribution is discussed and three subspecies are described as new to science: R. scherzeri periclitata ssp. nov., R. scherzeri ingae ssp. nov., and R. scherzeri eleanorae ssp. nov.

http://zoobank.org/urn:lsid:zoobank.org:pub:187FE235-A257-423F-8FC3-957117546400

15.
I propose a methodology to obtain and compare integral information on bird plumage coloration, using colour spectral data to conduct studies on geographic variation and taxonomy of different bird groups. I used principal component analysis and discriminant function analysis to compare groups of individuals by plumage coloration. As examples of the application of the methodology, I compared populations within the genera Eulampis and Anthracothorax. The results indicate possible taxonomic inadequacies and reveal situations that deserve further analysis, demonstrating the potential of the methodology in this area. Electronic supplementary material is available for this article and is accessible to authorized users.

16.
An electronic nose (EN) device was used to detect microbial and viral contaminations in a variety of animal cell culture systems. Volatile components emitted by the cultures accumulated in the bioreactor headspace, where they were sampled and subsequently analysed by the EN device. The EN, which was equipped with an array of 17 chemical gas sensors of varying selectivity towards the sampled volatile molecules, generated response patterns of up to 85 computed signals. Every 15 or 20 min a new gas sample was taken, generating a new response pattern. A software evaluation tool visualised the data mainly by using principal component analysis. The EN was first used to detect microbial contaminations in a Chinese hamster ovary (CHO) cell line producing a recombinant human macrophage colony stimulating factor (rhM-CSF). The CHO cell culture was contaminated by Escherichia coli, Pseudomonas aeruginosa, Staphylococcus aureus and Candida utilis, all of which were detected. The response patterns from the CHO cell culture were compared with monoculture references of the microorganisms. Second, contaminations were studied in an Sf-9 insect cell culture producing another recombinant protein (VP2 protein). Contaminants were detected from E. coli, a filamentous fungus and a baculovirus. Third, contamination of a human cell line, HEK-293, infected with E. coli exhibited comparable results. Fourth, bacterial contaminations could also be detected in cultures of a MLV vector producer cell line. Based on the overall experiences in this study it is concluded that the EN method has in a number of cases the potential to be developed into a useful on-line contamination alarm in order to support safety and economical operation for industrial cultivation.

17.
The present research was undertaken to determine the relationship between patterns of generalized intrapopulational variation determined from principal components analysis and patterns of sexual dimorphism determined from Student's t and discriminant function analysis. The analysis was based upon 17 measurements of 97 femurs from a Middle Mississippian Amerindian population. The results of the principal components analysis indicated that the 17 original measurements could be represented as four principal component variates. Inspection of component loadings lent support to the contention that the first principal component reflected variation in general size while components two to four reflected variation in femoral shape. Analysis of the relationship between principal component loading and male-female differences reflected in Student's t demonstrated a high (0.97) positive correlation between absolute magnitude of loading in the first principal component and magnitude of Student's t. As a result, discriminant analyses of the femur, utilizing univariate criteria for the inclusion of variables, have been biased in the direction of size variation. Subsequent stepwise discriminant function analyses demonstrated that an adequate discriminant model must reflect all dimensions of morphological variation at the intrapopulational level.

18.
In a recent study, the phylogeny of Caseidae (a herbivorous family of Palaeozoic synapsids belonging to the paraphyletic grade known as pelycosaurs) was analysed with a dataset employing more than three hundred continuous morphological characters in an effort to follow the principles of total evidence. Continuous characters are a source of great debate, with disagreements surrounding their suitability for and treatment in phylogenetic analysis. A number of shortcomings were identified in the handling of continuous characters in this study of caseids, including the use of gap weighting to discretize the characters and potential issues with redundancy and character non‐independence. Therefore, an alternative treatment for these characters is suggested here. First, rather than using gap weighting, the continuous characters were analysed in the program TNT, in which the raw values can be treated as continuous rather than discrete. Second, prior to the phylogenetic analysis, the continuous characters were subjected to a log‐ratio principal component analysis, and then the principal components were included in the character matrix rather than the raw ratios. Analysing the original data in TNT produced little difference in the results, but using the principal components as continuous characters resulted in alternative positions for Caseopsis agilis, Ennatosaurus tecton and Caseoides sanangeloensis. The differences are judged to be due to the reduced redundancy of the characters, the smaller number of principal components not overwhelming the discrete characters and the use of a scaling method which allows principal components with a higher variance to have a greater influence on the analysis. The positions of highly fragmentary fossils depended heavily on the method used to treat the missing characters in the principal component analysis, and so the method proposed here is not recommended for analysing very incomplete taxa.
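For readers unfamiliar with the discretization step the abstract criticizes, gap weighting is usually described as range-scaling each continuous character and rounding it to an ordered state. The sketch below follows that common description; the number of states, the taxon labels, and the ratios are hypothetical, and it is not claimed to match the settings of the original caseid dataset.

```python
def gap_weight(values, max_state):
    """Range-scale continuous values to ordered integer states 0..max_state."""
    lo, hi = min(values), max(values)
    return [round((v - lo) / (hi - lo) * max_state) for v in values]


# Hypothetical ratios of one continuous character in four taxa.
ratios = {"taxon_A": 0.50, "taxon_B": 0.55, "taxon_C": 0.90, "taxon_D": 1.00}
states = dict(zip(ratios, gap_weight(list(ratios.values()), max_state=9)))
```

Treating the raw values as continuous, as the abstract proposes, avoids the rounding: here taxon_A and taxon_B differ by only 0.05 yet land in different states.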

19.
20.
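The key to the successful invasion of Spartina alterniflora (互花米草) is its capacity for growth and reproduction and its ability to adapt to the environment. Leaf functional traits such as leaf water content, relative chlorophyll content, carbon-to-nitrogen ratio, total nitrogen, total phosphorus, and specific leaf area reflect its ability to use resources and to adapt to the environment. Taking the coastal wetland of Yancheng, Jiangsu, as the study area, the relationship between leaf functional traits of S. alterniflora and hyperspectral data was investigated. Principal component analysis was applied to the raw spectra and to the first-derivative spectra to extract new principal component variables, which were used as independent variables to build four kinds of prediction model for each trait: stepwise regression, BP neural network, support vector machine, and random forest. The best model was chosen by comparing the R² and RMSE of the fitted models, and it was then rebuilt on the sensitive bands identified by correlation analysis to verify its accuracy and applicability. The results show: (1) models built on first-derivative data performed better than those built on raw spectral data; (2) across the different functional traits, the predictive performance of the 4 models ranked as random forest > support vector machine > BP neural network > stepwise regression; the random forest model was highly accurate and stable and clearly outperformed the other 3 models, whereas stepwise regression performed worst and is not suitable for hyperspectral modeling of S. alterniflora leaf functional traits; (3) random forest models built on the sensitive bands identified by correlation analysis all had a modeling R² above 0.90 and a validation R² between 0.73 and 0.95, further confirming the accuracy and stability of the random forest model. The results indicate that hyperspectral data can serve as a powerful means of rapidly monitoring the growth status of S. alterniflora, and that the random forest model can serve as a high-accuracy model for estimating its different leaf functional traits.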
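Two pieces of the workflow above lend themselves to a short sketch: the first-derivative transform of a spectrum and the R² and RMSE used to compare models. The finite-difference derivative is one common choice and is an assumption here, as are the wavelengths, reflectances, trait values, and the two placeholder models.

```python
import math


def first_derivative(wavelengths, reflectance):
    """Finite-difference first derivative between adjacent bands."""
    return [
        (reflectance[i + 1] - reflectance[i]) / (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(reflectance) - 1)
    ]


def r_squared(observed, predicted):
    """Coefficient of determination of predictions against observations."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error of predictions against observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical spectrum around the red edge (wavelengths in nm).
wavelengths = [680.0, 690.0, 700.0, 710.0]
reflectance = [0.05, 0.06, 0.12, 0.30]
derivative = first_derivative(wavelengths, reflectance)

# Hypothetical trait values and predictions from two placeholder models.
observed = [2.0, 3.0, 4.0, 5.0]
predictions = {
    "model_a": [2.1, 2.9, 4.2, 4.8],
    "model_b": [3.0, 3.0, 4.0, 4.0],
}
scores = {
    name: (r_squared(observed, pred), rmse(observed, pred))
    for name, pred in predictions.items()
}
best = max(scores, key=lambda name: scores[name][0])
```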
