Similar literature
 20 similar articles found (search time: 0 ms)
1.
    
《Genetics》2021,218(1)

2.
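Principal component analysis was applied to soil factors and the growth status of Artemisia sphaerocephala (白沙蒿) in aerially seeded areas of the Alxa desert region, identifying the main limiting factors among soil properties and the most important indicators of community growth. The results showed that: (1) among the soil factors, soil water content in June was the most important, with a contribution rate of 63.6%. The fractal dimension of the soil also had an important effect on soil properties, with a contribution rate of 8.8%. The contribution rates of soil N, P, and K contents were 5.4%, 3.5%, and 3.5%, respectively, so these factors should not be neglected either. (2) Aboveground biomass best reflected plant growth status, with a contribution rate of 60.1%, and it was very highly correlated with several of the most important soil factors. (3) Ordination on the first and second principal components of the soil factors showed that A. sphaerocephala population densities fell into three types, and that population growth was best at a density of 2.1 plants/m².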

3.
Researchers often use a two-step process to analyze multivariate data. First, dimensionality is reduced using a technique such as principal component analysis, followed by a group comparison using a t-test or analysis of variance. Although this practice is often discouraged, the statistical properties of this procedure are not well understood, starting with the hypothesis being tested. We suggest that this approach might be considering two distinct hypotheses, one of which is a global test of no differences in the mean vectors, and the other being a focused test of a specific linear combination where the coefficients have been estimated from the data. We study the asymptotic properties of the two-sample t-statistic for these two scenarios, assuming a nonsparse setting. We show that the size of the global test agrees with the presumed level but that the test has poor power. In contrast, the size of the focused test can be arbitrarily distorted with certain mean and covariance structures. A simple method is provided to correct the size of the focused test. Data analyses and simulations are used to illustrate the results. Recommendations on the use of this two-step method and the related use of principal components for prediction are provided.

4.
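Covariance-matrix principal component analysis for the genetic structure of human populations   Cited by: 3 (self-citations: 0, by others: 3)
Objective: To examine the applicability and soundness of principal component analysis (PCA) based on the covariance matrix of the centered (or mean-scaled) gene frequency matrix in studies of human population genetic structure. Methods: Starting from the structural characteristics of the gene frequency matrix, the study analyzes how PCA of the centered and mean-scaled covariance matrices differs from PCA of the standardized correlation matrix in eigenvalues, eigenvectors, and dimension-reduction performance, and uses a worked example to compare how reasonably the different methods explain features of population genetic structure. Results: The principal components of the centered (or mean-scaled) covariance matrix reflect both the "variance information weight", which captures the degree of gene variation, and the "correlation information weight", which captures the degree of mutual influence among genes; the principal components of the standardized correlation matrix reflect only the "correlation information weight" and not the "variance information weight". A comparison of the two PCA results for the HLA-A locus in 26 Han Chinese populations confirmed that PCA of the centered covariance matrix outperforms PCA of the standardized correlation matrix in eigenvalues and eigenvectors, in the number of principal components retained, and in the soundness of the population-genetic interpretation of the components. Conclusion: When applying PCA to population genetic structure, a centering (or mean-scaling) transformation should be used to remove the effect of magnitude in the gene frequency matrix, and the principal components should then be extracted from its covariance matrix.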
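The contrast drawn in the abstract above, between a covariance matrix that keeps variance information and a correlation matrix that discards it, can be illustrated with a minimal sketch. The allele frequencies and all function names below are hypothetical and are not taken from the paper. The sketch only compares each allele's share of the total variance (the trace, which equals the sum of the eigenvalues) under the two input matrices; it does not perform the full eigendecomposition.

```python
def covariance_matrix(rows):
    """Sample covariance matrix of column-centered data (divisor n - 1)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    p = len(means)
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def correlation_matrix(cov):
    """Rescale a covariance matrix so that every variable has unit variance."""
    p = len(cov)
    sd = [cov[i][i] ** 0.5 for i in range(p)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(p)] for i in range(p)]


def trace_shares(matrix):
    """Share of the total variance (trace) contributed by each variable."""
    total = sum(matrix[i][i] for i in range(len(matrix)))
    return [matrix[i][i] / total for i in range(len(matrix))]


# Hypothetical allele frequencies: 4 populations (rows) x 3 alleles (columns).
freqs = [
    [0.30, 0.10, 0.02],
    [0.20, 0.12, 0.03],
    [0.40, 0.08, 0.02],
    [0.10, 0.10, 0.01],
]

cov = covariance_matrix(freqs)
corr = correlation_matrix(cov)
cov_shares = trace_shares(cov)    # dominated by the highly variable first allele
corr_shares = trace_shares(corr)  # every allele forced to an equal share

print("covariance shares: ", [round(s, 3) for s in cov_shares])
print("correlation shares:", [round(s, 3) for s in corr_shares])
```

With these made-up frequencies the first allele carries most of the total variance under the covariance matrix, whereas standardization gives each allele an equal one-third share, which is one way to read the abstract's remark that the correlation matrix carries no "variance information weight".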

5.
    
Functional principal component analysis (FPCA) has been widely used to capture major modes of variation and reduce dimensions in functional data analysis. However, standard FPCA based on the sample covariance estimator does not work well if the data exhibit heavy-tailedness or outliers. To address this challenge, a new robust FPCA approach based on a functional pairwise spatial sign (PASS) operator, termed PASS FPCA, is introduced. We propose robust estimation procedures for eigenfunctions and eigenvalues. Theoretical properties of the PASS operator are established, showing that it adopts the same eigenfunctions as the standard covariance operator and also allows recovering ratios between eigenvalues. We also extend the proposed procedure to handle functional data measured with noise. Compared to existing robust FPCA approaches, the proposed PASS FPCA requires weaker distributional assumptions to conserve the eigenspace of the covariance function. Specifically, existing work is often built upon a class of functional elliptical distributions, which inherently requires symmetry. In contrast, we introduce a class of distributions called the weakly functional coordinate symmetry (weakly FCS), which allows for severe asymmetry and is much more flexible than the functional elliptical distribution family. The robustness of the PASS FPCA is demonstrated via extensive simulation studies, especially its advantages in scenarios with nonelliptical distributions. The proposed method was motivated by and applied to analysis of accelerometry data from the Objective Physical Activity and Cardiovascular Health Study, a large-scale epidemiological study to investigate the relationship between objectively measured physical activity and cardiovascular health among older women.

6.
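Guo Lianjin (郭连金) 《西北植物学报》2014,34(9):1887-1893
Data were collected using the contiguous grid quadrat method, and the spatial pattern of Emmenopterys henryi (香果树) seedling populations was studied using theoretical distribution models, the dispersion coefficient method, the Morisita pattern index, aggregation intensity indices, and the Greig-Smith block mean-square method. The results showed that: (1) seedlings of the endangered plant E. henryi had an aggregated distribution; aggregation intensity was strongly affected by block scale and decreased as scale increased, and the pattern intensity of the different seedling populations also tended to decline with increasing latitude. (2) The Greig-Smith block mean-square method showed that E. henryi populations aggregated at 8 to 16 m² and at 50 to 64 m²; in the size structure, seedlings 0 to 40 cm in height were relatively abundant, but after strong environmental filtering most died before reaching 120 cm. (3) Principal component analysis showed that E. henryi seedlings were strongly influenced by tree cover, shrub cover, light intensity, and atmospheric temperature and humidity. It is therefore recommended to strengthen protection of the native habitat of E. henryi and to strictly prohibit damage to its seedlings from indiscriminate cutting, felling, and grazing, and to promote seedling development by reducing tree-layer and shrub-layer cover, clearing understory litter and moss, increasing understory light intensity, and creating small forest gaps.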
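As a rough illustration of one of the pattern indices named in the abstract above, the following sketch computes Morisita's index of dispersion from quadrat counts using its standard formula, I_d = n Σ x_i(x_i − 1) / (N(N − 1)), where n is the number of quadrats and N the total number of individuals. The counts are invented for the example and are not data from the study, and the function name is hypothetical.

```python
def morisita_index(counts):
    """Morisita's index of dispersion for a list of per-quadrat counts.

    Values above 1 suggest an aggregated pattern, values near 1 a random
    pattern, and values below 1 a uniform pattern.
    """
    n = len(counts)
    total = sum(counts)
    if total < 2:
        raise ValueError("need at least two individuals in total")
    return n * sum(x * (x - 1) for x in counts) / (total * (total - 1))


# Hypothetical seedling counts in four quadrats of equal area.
clumped = [12, 0, 0, 0]   # all seedlings in one quadrat
even = [3, 3, 3, 3]       # seedlings spread evenly

print(morisita_index(clumped))  # 4.0, strongly aggregated
print(morisita_index(even))     # about 0.73, more uniform than random
```

Recomputing the index after pooling adjacent quadrats into larger blocks is one simple way to examine the scale dependence that the abstract reports.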

7.
Quantitative determination by fluorescence spectroscopy is possible because of the linear relationship between the intensity of emitted fluorescence and the fluorophore concentration. However, concentration quenching may cause the relationship to become nonlinear, and thus, the optimal dilution ratio has to be determined. In the case of fluorescence fingerprint (FF) measurement, fluorescence is measured under multiple wavelength conditions, and a method of determining the optimal dilution ratio for multivariate data such as FFs has not been reported. In this study, the FFs of mixed solutions of tryptophan and epicatechin of different concentrations and composition ratios were measured. Principal component analysis was applied, and the resulting loading plots were found to contain useful information about each constituent. The optimal concentration ranges could be determined by identifying the linear region of the PC score plotted against total concentration.

8.
Hyoung-Tak Im 《Plant Species Biology》1987,2(1-2):117-126
To understand morphological differentiation and to recognize natural groups in the Saussurea nipponica complex, 440 individuals from 19 populations were examined, chiefly with statistical methods. The variation ranges of 16 morphological characters within and between populations were analyzed not only separately but also synthetically by Duncan's multiple range test, principal component analysis, and cluster analysis. Of the 16 characters examined, those concerning plant size (height and diameter of stem, size of involucre, etc.) and the involucral bract (length of the involucral bract and of its recurved part) are suggested to be important for recognizing natural groups. Five groups are recognized by a complex pattern of the morphological characters. They can be defined multivariately as natural groups having indigenous habitats and distribution ranges, and are considered subspecies of S. nipponica.

9.
Using one male-inherited and eight biparentally inherited microsatellite markers, we investigate the population genetic structure of the Valais chromosome race of the common shrew (Sorex araneus) in the Central Alps of Europe. Unexpectedly, the Y-chromosome microsatellite suggests nearly complete absence of male gene flow among populations from the St-Bernard and Simplon regions (Switzerland). Autosomal markers also show significant genetic structuring among these two geographical areas. Isolation by distance is significant and possible barriers to gene flow exist in the study area. Two different approaches are used to better understand the geographical patterns and the causes of this structuring. Using a principal component analysis for which a testing procedure exists, and partial Mantel tests, we show that the St-Bernard pass does not represent a significant barrier to gene flow although it culminates at 2469 m, close to the highest altitudinal record for this species. Similar results are found for the Simplon pass, indicating that both passes represented potential postglacial recolonization routes into Switzerland from Italian refugia after the last Pleistocene glaciations. In contrast with the weak effect of these mountain passes, the Rhône valley lowlands significantly reduce gene flow in this species. Natural obstacles (the large Rhône river) and unsuitable habitats (dry slopes) are both present in the valley. Moreover, anthropogenic changes to landscape structures are likely to have strongly reduced available habitats for this shrew in the lowlands, thereby promoting genetic differentiation of populations found on opposite sides of the Rhône valley.

10.
11.
    
Protein folds are built primarily from the packing together of two types of structures: alpha-helices and beta-sheets. Neither structure is rigid, and the flexibility of helices and sheets is often important in determining the final fold (e.g., coiled coils and beta-barrels). Recent work has quantified the flexibility of alpha-helices using a principal component analysis (PCA) of database helical structures (J. Mol. Biol. 2003, 327, pp. 229-237). Here, we extend the analysis to beta-sheet flexibility using PCA on a database of beta-sheet structures. For sheets of varying dimension and geometry, we find two dominant modes of flexibility: twist and bend. The distributions of amplitudes for these modes are found to be Gaussian and independent, suggesting that the PCA twist and bend modes can be identified as the soft elastic normal modes of sheets. We consider the scaling of mode eigenvalues with sheet size and find that parallel beta-sheets are more rigid than antiparallel sheets over the entire range studied. Finally, we discuss the application of our PCA results to modeling and design of beta-sheet proteins.

12.
13.
14.
In a preceding paper (M. Eger and R. Eckhorn, J. Comput. Neurosci., 2002) we published a three-step method for the quantification of transinformation in multi-input and multi-output neuronal systems. Here we present an extension that applies to rapid series of transient stimuli and thus fills the gap between the discrete and continuous stimulation paradigms. While the three-step method potentially captures all stimulus aspects, the present approach quantifies the discriminability of selected attributes of discrete stimuli and thus assesses their encoding. Based on simulated and recorded data we investigate the performance of the implemented algorithm. Our approach is illustrated by analyses of neuronal population activity from the visual cortex of the cat, evoked by electrical stimuli of the retina.

15.
    
Identification of population structure can help trace population histories and identify disease genes. Structured association (SA) is a commonly used approach for population structure identification and association mapping. A major issue with SA is that its performance greatly depends on the informativeness and the numbers of ancestral informative markers (AIMs). Present major AIM selection methods mostly require prior individual ancestry information, which is usually not available or uncertain in practice. To address this potential weakness, we herein develop a novel approach for AIM selection based on principal component analysis (PCA), which does not require prior ancestry information of study subjects. Our simulation and real genetic data analysis results suggest that, with equivalent AIMs, PCA-based selected AIMs can significantly increase the accuracy of inferred individual ancestries compared with traditionally randomly selected AIMs. Our method can easily be applied to whole genome data to select a set of highly informative AIMs in population structure, which can then be used to identify potential population structure and correct possible statistical biases caused by population stratification.

16.
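Genetic diversity analysis of lines from a composite cotton population in the Yangtze River valley   Cited by: 1 (self-citations: 0, by others: 1)
Using 15 cotton hybrids from the Yangtze River valley as base material, a composite population was constructed by intercrossing with mixed pollen, and 29 cotton lines were selected from it. Twelve main traits were evaluated in field trials. Among the derived lines, the pre-frost seed cotton percentage showed the greatest variation, followed by seed cotton yield and its components, while fiber quality traits showed the least variation. Principal component analysis showed that the first five principal components, representing fiber quality, yield and yield components, pre-frost seed cotton percentage, lint percentage, and plant height, contributed 24.312%, 19.662%, 13.287%, 10.812%, and 9.085% of the variance, respectively. Based on SSR molecular marker differences, the great majority of the derived lines clustered into one group with small genetic differences, clearly distinct from cotton cultivars of the Yellow River valley.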

17.
    
The Neotropical genus Heliconius (Nymphalidae) is unique among butterflies for its pollen-feeding behaviour. With the application of saliva, they extract amino acids from pollen grains on the outside of the proboscis. We predicted that the salivary glands of pollen-feeding Heliconiinae would show adaptations to this derived feeding behaviour. A biometrical analysis of the salivary glands revealed that pollen-feeding butterflies of the genus Heliconius have disproportionately longer and more voluminous salivary glands than nonpollen-feeding Nymphalidae. The first two components in the principal component analysis explained approximately 95% of the total variance. The size-dependent factor score coefficients of body length and salivary gland parameters were predominately represented on axis 1. They significantly discriminated pollen-feeding from nonpollen-feeding heliconiines on that axis. Factor score coefficients for the volume of the secretory region of the salivary glands separated heliconiines from the outgroup species. The detailed biometrical analysis of salivary gland features thus provides strong evidence that the secretory regions of the salivary glands are larger in pollen-feeding butterflies. We concluded that pollen feeding is associated with a high production of salivary fluid. © 2009 The Linnean Society of London, Biological Journal of the Linnean Society, 2009, 97, 604–612.

18.
Genotype data from 20 microsatellites typed in 253 animals is used here to assess the genetic structure of seven European pedigree cattle breeds. Estimation of genetic subdivision using classical drift-based measures shows that the average proportion of genetic variation among breeds varies between 10 and 11% of the total, depending on the estimator used. We demonstrate that a simple allele-sharing genetic distance parameter can be used to construct a dendrogram of relationships among animals. This phylogenetic tree displays a remarkable degree of breed clustering and reflects an extensive underlying kinship structure, particularly for the Swiss Simmental breed and four breeds originating from the British Isles. Condensation of allele frequencies and individual genotypic compositions using principal component analysis is also used to investigate genetic structure among breeds and individual animals. In addition, the underlying genetic demarcation of European cattle breeds is emphasized in simulations of breed assignment using allele frequency distributions from samples of microsatellite loci. Correct breed designation can be inferred with accuracies approaching 100% using data from a panel of 10 microsatellite loci.
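The abstract above does not define its allele-sharing distance, so the following is only a sketch of one commonly used form: one minus the mean proportion of alleles shared between two diploid individuals across loci. The genotypes, allele sizes, and function names are hypothetical, and the paper's own definition may differ.

```python
from collections import Counter


def shared_alleles(genotype_a, genotype_b):
    """Number of alleles (0, 1 or 2) shared by two diploid genotypes at one locus."""
    return sum((Counter(genotype_a) & Counter(genotype_b)).values())


def allele_sharing_distance(individual_a, individual_b):
    """One minus the mean proportion of shared alleles across all loci."""
    if len(individual_a) != len(individual_b) or not individual_a:
        raise ValueError("individuals must be typed at the same, non-empty set of loci")
    shared = sum(shared_alleles(a, b) for a, b in zip(individual_a, individual_b))
    return 1.0 - shared / (2 * len(individual_a))


# Hypothetical microsatellite genotypes (allele sizes in base pairs) at three loci.
animal_a = [(120, 124), (98, 98), (150, 152)]
animal_b = [(120, 124), (98, 100), (148, 154)]

print(allele_sharing_distance(animal_a, animal_b))  # 0.5
print(allele_sharing_distance(animal_a, animal_a))  # 0.0
```

A matrix of such pairwise distances could then be passed to any standard hierarchical clustering routine to draw a dendrogram of individuals.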

19.
20.
    
Advances in molecular “omics” technologies have motivated new methodologies for the integration of multiple sources of high-content biomedical data. However, most statistical methods for integrating multiple data matrices only consider data shared vertically (one cohort on multiple platforms) or horizontally (different cohorts on a single platform). This is limiting for data that take the form of bidimensionally linked matrices (eg, multiple cohorts measured on multiple platforms), which are increasingly common in large-scale biomedical studies. In this paper, we propose bidimensional integrative factorization (BIDIFAC) for integrative dimension reduction and signal approximation of bidimensionally linked data matrices. Our method factorizes data into (a) globally shared, (b) row-shared, (c) column-shared, and (d) single-matrix structural components, facilitating the investigation of shared and unique patterns of variability. For estimation, we use a penalized objective function that extends the nuclear norm penalization for a single matrix. As an alternative to the complicated rank selection problem, we use results from random matrix theory to choose tuning parameters. We apply our method to integrate two genomics platforms (messenger RNA and microRNA expression) across two sample cohorts (tumor samples and normal tissue samples) using the breast cancer data from the Cancer Genome Atlas. We provide R code for fitting BIDIFAC, imputing missing values, and generating simulated data.
