Found 20 similar documents; search took 8 ms
1.
A standard multivariate principal components (PCs) method was utilized to identify clusters of variables that may be controlled by a common gene or genes (pleiotropy). Heritability estimates were obtained and linkage analyses performed on six individual traits (total cholesterol (Chol), high and low density lipoproteins, triglycerides (TG), body mass index (BMI), and systolic blood pressure (SBP)) and on each PC to compare our ability to identify major gene effects. Using the simulated data from Genetic Analysis Workshop 13 (Cohort 1 and 2 data for year 11), the quantitative traits were first adjusted for age, sex, and smoking (cigarettes per day). Adjusted variables were standardized and PCs calculated followed by orthogonal transformation (varimax rotation). Rotated PCs were then subjected to heritability and quantitative multipoint linkage analysis. The first three PCs explained 73% of the total phenotypic variance. Heritability estimates were above 0.60 for all three PCs. We performed linkage analyses on the PCs as well as the individual traits. The majority of pleiotropic and trait-specific genes were not identified. Standard PCs analysis methods did not facilitate the identification of pleiotropic genes affecting the six traits examined in the simulated data set. In addition, genes contributing 20% of the variance in traits with over 0.60 heritability estimates could not be identified in this simulated data set using traditional quantitative trait linkage analyses. Lack of identification of pleiotropic and trait-specific genes in some cases may reflect their low contribution to the traits/PCs examined or more importantly, characteristics of the sample group analyzed, and not simply a failure of the PC approach itself.
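The standardize, extract, and rotate steps described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the six "traits" are synthetic stand-ins generated from random numbers, the covariate adjustment is skipped, and `varimax` is a hypothetical helper written here with the commonly used SVD-based iteration.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a (variables x components) loading matrix."""
    n_vars, n_comp = loadings.shape
    rotation = np.eye(n_comp)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = (rotated ** 2).sum(axis=0)
        # Gradient-like target of the varimax criterion
        target = rotated ** 3 - rotated * column_ss / n_vars
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if objective > 0 and s.sum() < objective * (1 + tol):
            break
        objective = s.sum()
    return loadings @ rotation, rotation

# Synthetic stand-in for six adjusted quantitative traits (assumption, not GAW13 data)
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
traits = latent @ rng.normal(size=(3, 6)) + 0.5 * rng.normal(size=(500, 6))

# Standardize, then extract PCs from the correlation matrix
z = (traits - traits.mean(axis=0)) / traits.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(z, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

n_keep = 3  # the abstract retains three PCs; the choice here simply mirrors that
loadings = eigvecs[:, :n_keep] * np.sqrt(eigvals[:n_keep])
rotated_loadings, rotation = varimax(loadings)

# Rotated component scores: the phenotypes that would go on to linkage analysis
rotated_scores = (z @ eigvecs[:, :n_keep] / np.sqrt(eigvals[:n_keep])) @ rotation
explained = eigvals[:n_keep].sum() / eigvals.sum()
print(f"variance explained by {n_keep} PCs: {explained:.2f}")
```

Because the rotation is orthogonal, each trait's communality (row sum of squared loadings) is unchanged; only the way variance is distributed across components shifts.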
2.
Principal component analysis enhanced by the use of smoothing is used in conjunction with discriminant analysis techniques to devise a statistical classification method for the analysis of event-related potential (ERP) data. A training set of premedication potentials collected from adolescents with attention-deficit hyperactivity disorder (ADHD) is used to predict whether adolescents from an independent subject group will respond to long-term medication. Comparison of outcome prediction rates demonstrates that this method, which uses information from the whole ERP curve, is superior to the classification technique currently used by clinicians, which is based on a single ERP curve feature. The need to administer an initial dose of medication to classify patients is also eliminated.
3.
Influence in principal components analysis (total citations: 4; self-citations: 0; citations by others: 4)
4.
Local influence in principal components analysis (total citations: 5; self-citations: 0; citations by others: 5)
5.
6.
Despite the significant advances made over the last few years in mapping inversions with the advent of paired-end sequencing approaches, our understanding of the prevalence and spectrum of inversions in the human genome has lagged behind other types of structural variants, mainly due to the lack of a cost-efficient method applicable to large-scale samples. We propose a novel method based on principal components analysis (PCA) to characterize inversion polymorphisms using high-density SNP genotype data. Our method applies to non-recurrent inversions for which recombination between the inverted and non-inverted segments in inversion heterozygotes is suppressed due to the loss of unbalanced gametes. Inside such an inversion region, an effect similar to population substructure is thus created: two distinct "populations" of inversion homozygotes of different orientations and their 1:1 admixture, namely the inversion heterozygotes. This kind of substructure can be readily detected by performing PCA locally in the inversion regions. Using simulations, we demonstrated that the proposed method can be used to detect and genotype inversion polymorphisms using unphased genotype data. We applied our method to the phase III HapMap data and inferred the inversion genotypes of known inversion polymorphisms at 8p23.1 and 17q21.31. These inversion genotypes were validated by comparing with literature results and by checking Mendelian consistency using the family data whenever available. Based on the PCA-approach, we also performed a preliminary genome-wide scan for inversions using the HapMap data, which resulted in 2040 candidate inversions, 169 of which overlapped with previously reported inversions. Our method can be readily applied to the abundant SNP data, and is expected to play an important role in developing human genome maps of inversions and exploring associations between inversions and susceptibility of diseases.
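The substructure argument in this abstract (two homozygote "populations" plus their 1:1 admixture) can be illustrated with a toy simulation. The sketch below is an assumption-laden illustration, not the published method: SNPs are simulated independently with no linkage disequilibrium, the allele frequencies for each orientation are drawn at random, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_ind, n_snp = 300, 200

# Each individual carries two haplotypes; True marks the inverted orientation
orientation = rng.random((n_ind, 2)) < 0.4
inv_genotype = orientation.sum(axis=1)  # 0, 1, or 2 inverted copies

# Hypothetical allele frequencies that differ between the two orientations
freq_std = rng.uniform(0.1, 0.9, n_snp)
freq_inv = rng.uniform(0.1, 0.9, n_snp)
hap_freq = np.where(orientation[:, :, None], freq_inv, freq_std)

# Unphased genotypes: allele counts (0/1/2) summed over the two haplotypes
haplotypes = rng.random((n_ind, 2, n_snp)) < hap_freq
genotypes = haplotypes.sum(axis=1).astype(float)

# Local PCA: centre the SNP columns of the region and take the first component
centered = genotypes - genotypes.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
pc1 = u[:, 0] * s[0]

# Heterozygotes should sit between the two homozygote clusters on PC1
group_means = np.array([pc1[inv_genotype == g].mean() for g in (0, 1, 2)])
corr = np.corrcoef(pc1, inv_genotype)[0, 1]
print("mean PC1 by inversion genotype:", np.round(group_means, 2))
```

In this toy setting the first component orders individuals by their number of inverted copies, which is the pattern the abstract describes; real data would additionally need a clustering step to assign genotypes.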
7.
Background
In microarray data analysis, the comparison of gene-expression profiles with respect to different conditions and the selection of biologically interesting genes are crucial tasks. Multivariate statistical methods have been applied to analyze these large datasets. Less work has been published concerning the assessment of the reliability of gene-selection procedures. Here we describe a method to assess reliability in multivariate microarray data analysis using permutation-validated principal components analysis (PCA). The approach is designed for microarray data with a group structure.
8.
It has long been recognized that tooth crown diameters in hominoids are all positively intercorrelated one with another. This study reports on sex-specific correlation matrices derived from 2,650 individuals from the Solomon Islands, Melanesia. Mesiodistal and buccolingual diameters of all permanent teeth from one side are used, excluding third molars. Analysis discloses significant sex dimorphism in the strengths of the intercorrelations, with females being better integrated. Principal components analysis (PCA) provides an objective means of data reduction (shown here to be preferable to simple size summation methods) and decorrelation of the resulting linear combinations. Four components are extracted (with results being virtually identical in the two sexes) and arguments are put forth that varimax rotation to "a simpler solution" may be counterproductive. Before rotation, the four components are 1) overall size, 2) buccolingual widths contrasted with mesiodistal lengths, 3) anterior (I,C) contrasted with posterior (P,M) teeth, and 4) premolars contrasted with molars. Most of the explained (shared) variance (63%) extracted by PCA is in overall size of the dentition. There is a strong urge to view the results of these principal components analyses as reflective of biologically and genetically meaningful entities.
9.
A study is made of the dune system at Tentsmuir Point National Nature Reserve, Scotland, using transects crossing the vegetation zonation. Principal Components Analysis and tabular ordination are used to analyse the data, and an attempt is made to relate the results obtained to the dynamics of the system. The effects of different management regimes are considered, and it is concluded that the establishment of pine on the area has the largest effect on the development of the vegetation. Reduction in grazing pressure by rabbits is found to increase species diversity slightly, but has no major influence as yet on vegetation development. While some information on the dynamics of the vegetation can be inferred, the problems involved in this are considered to be large, and the study raises a number of questions to be studied in greater detail. It is concluded that permanent plots would be the most effective method to employ. Nomenclature follows Clapham, Tutin & Warburg (1962) for vascular plants, Watson (1968) for bryophytes, and Duncan (1970) for lichens. We should like to thank Dr. R. A. H. Smith of the Nature Conservancy Council for her assistance and permission to work on Tentsmuir Point N. N. R., and Mr. J. G. Young, then warden at Tentsmuir for his help at the start of the project. We are grateful to Dr. R. Meutzelfeldt for computational advice. In addition, one of us (R.J.H.) would like to thank Dr. E. van der Maarel and Dr. R. S. Clymo for their tuition during the Nordic Council for Ecology course 'Numerical Methods in Vegetation Analysis' in Lund, September, 1979.
10.
A study of Vibrio anguillarum from farmed and wild fish using principal components analysis (total citations: 1; self-citations: 0; citations by others: 1)
Principal components analysis revealed two main groups among 163 Vibrio anguillarum cultures from diseased fish from Norwegian waters. Nearly all isolates from farmed salmonids fell in group I (arabinose positive) but those from wild fish, particularly saithe Gadus virens, more commonly appeared in group II (arabinose negative).
11.
Many recent approaches to decoding neural spike trains depend critically on the assumption that for low-pass filtered spike
trains, the temporal structure is optimally represented by a small number of linear projections onto the data. We therefore tested this assumption of linearity by comparing a linear factor analysis technique
(principal components analysis) with a nonlinear neural network based method. It is first shown that the nonlinear technique
can reliably identify a neuronally plausible nonlinearity in synthetic spike trains. However, when applied to the outputs
from primary visual cortical neurons, this method shows no evidence for significant temporal nonlinearities. The implications
of this are discussed.
Received: 29 November 1996 / Accepted in revised form: 1 July 1997
12.
Catherine Cullingham, Rhiannon M. Peery, Joshua M. Miller 《Molecular ecology resources》2023,23(3):519-522
Identification of population structure is a common goal for a variety of applications, including conservation, wildlife management, and medical genetics. The outcome of these analyses can have far reaching implications; therefore, it is important to ensure an understanding of the strengths and weaknesses of the methodologies used. Increasing in popularity, the discriminant analysis of principal components (DAPC) method incorporates combinations of genetic variables (alleles) into a model that differentiates individuals into genetic clusters. However, users may not have a full understanding of how to best parameterize the model. In this issue of Molecular Ecology Resources, Thia (2022) looks under the hood of the DAPC. Using simulated data, he demonstrates the importance of careful parameter selection in developing a DAPC model, what the implications are for over-fitting the model, and finally, how best to evaluate the results of DAPC models. This work highlights the issues that can arise when over-parameterizing the DAPC model when gene flow is high among clusters and provides important guidelines to ensure researchers are making conclusions that are biologically relevant.
13.
The purpose of many microarray studies is to find the association between gene expression and sample characteristics such as treatment type or sample phenotype. There has been a surge of efforts developing different methods for delineating the association. Aside from the high dimensionality of microarray data, one well recognized challenge is that genes can be interrelated in complex ways, which makes many statistical methods inappropriate to use directly on the expression data. Multivariate methods such as principal component analysis (PCA) and clustering are often used as a part of the effort to capture the gene correlation, and the derived components or clusters are used to describe the association between gene expression and sample phenotype. We propose a method for patient population dichotomization using maximally selected test statistics in combination with the PCA method, which shows favorable results. The proposed method is compared with a currently well-recognized method.
14.
We present robust estimators for the mean and the principal components of a stochastic process in . Robustness and asymptotic properties of the estimators are studied theoretically, by simulation and by example. It is shown that the proposed estimators are generally more robust to outliers than the commonly used sample mean and principal components, although their properties depend on the spacings of the eigenvalues of the covariance function.
15.
We propose a modelling framework to study the relationship between two paired longitudinally observed variables. The data for each variable are viewed as smooth curves measured at discrete time-points plus random errors. While the curves for each variable are summarized using a few important principal components, the association of the two longitudinal variables is modelled through the association of the principal component scores. We use penalized splines to model the mean curves and the principal component curves, and cast the proposed model into a mixed-effects model framework for model fitting, prediction and inference. The proposed method can be applied in the difficult case in which the measurement times are irregular and sparse and may differ widely across individuals. Use of functional principal components enhances model interpretation and improves statistical and numerical stability of the parameter estimates.
16.
A method for simultaneous, nondestructive analysis of aminopyrine and phenacetin in compound aminopyrine phenacetin tablets with different concentrations has been developed by principal component artificial neural networks (PC-ANNs) on near-infrared (NIR) spectroscopy. In PC-ANN models, the spectral data were initially analyzed by principal component analysis. Then the scores of the principal components were chosen as input nodes for the input layer instead of the spectral data. The artificial neural network models using the spectral data as input nodes were also established and compared with the PC-ANN models. Four different preprocessing methods (first-derivative, second-derivative, standard normal variate (SNV), and multiplicative scatter correction) were applied to three sets of NIR spectra of compound aminopyrine phenacetin tablets. The PC-ANNs approach with SNV preprocessing spectra was found to provide the best results. The degree of approximation was used as the selection criterion for the optimum network parameters.
17.
18.
Scott Nichols 《Plant Ecology》1997,34(3):191-197
Summary: Principal components analysis is well suited for many data analysis problems in ecology, particularly for data reduction and hypothesis generation; but the structure of PCA is poorly suited for indirect gradient analysis. Whatever the intended application of PCA, the user must exercise special care in selecting data transformations to prevent the analysis from being overwhelmed by the purely numerical effects in the variance structure of the data. I would like to thank R. H. Whittaker, H. G. Gauch, R. E. Moeller, and S. R. Searle for their guidance and assistance.
19.
Exequiel Ezcurra 《Plant Ecology》1987,71(1):41-47
Non-centred Principal Components Analysis (NPCA) ordinates sites and species simultaneously, and can be solved either by direct iteration or by eigenvector calculation. The weight of sites and species in the analysis is proportional to their overall abundance. Because of this, the method is not susceptible to distortion by rare species, as is the case with Reciprocal Averaging (RA). Detrending techniques can also be applied to this method to eliminate arch effects. When NPCA was tried with field data, it produced ordination axes that were significantly associated with independently measured environmental variables. In contrast, RA failed to produce axes related to environmental factors, even after the main rare species had been eliminated from the analysis.
Abbreviations: NPCA, Non-centred Principal Components Analysis; RA, Reciprocal Averaging
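The two solution routes mentioned for non-centred PCA, direct iteration and eigenvector calculation, can be compared on a small example. The sketch below is one plausible reading of "direct iteration" (alternating between site and species scores, which amounts to power iteration); it is not taken from the paper, and the abundance table and function name are hypothetical.

```python
import numpy as np

def first_axis_by_iteration(table, n_iter=500, tol=1e-12):
    """Alternate between site scores and species scores on the uncentred table."""
    species = np.ones(table.shape[1]) / np.sqrt(table.shape[1])
    for _ in range(n_iter):
        sites = table @ species            # site scores from species scores
        new_species = table.T @ sites      # species scores from site scores
        new_species /= np.linalg.norm(new_species)
        converged = np.linalg.norm(new_species - species) < tol
        species = new_species
        if converged:
            break
    return table @ species, species

# Hypothetical sites x species abundance table (random counts, not field data)
rng = np.random.default_rng(2)
abundance = rng.poisson(lam=3.0, size=(12, 6)).astype(float)

site_scores, species_scores = first_axis_by_iteration(abundance)

# Eigenvector route: decompose the cross-product matrix without centring columns
eigvals, eigvecs = np.linalg.eigh(abundance.T @ abundance)
leading = eigvecs[:, -1]
if leading.sum() < 0:                      # eigenvector sign is arbitrary
    leading = -leading

print("max difference between routes:", np.abs(species_scores - leading).max())
```

Because the table is never centred, abundant sites and species dominate the cross-product matrix, which matches the abstract's remark that weights are proportional to overall abundance.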
20.
The immense volume and rapid growth of human genomic data, especially single nucleotide polymorphisms (SNPs), present special challenges for both biomedical researchers and automatic algorithms. One such challenge is to select an optimal subset of SNPs, commonly referred to as "haplotype tagging SNPs" (htSNPs), to capture most of the haplotype diversity of each haplotype block or gene-specific region. This information-reduction process facilitates cost-effective genotyping and, subsequently, genotype-phenotype association studies. It also has implications for assessing the risk of identifying research subjects on the basis of SNP information deposited in public domain databases. We have investigated methods for selecting htSNPs by use of principal components analysis (PCA). These methods first identify eigenSNPs and then map them to actual SNPs. We evaluated two mapping strategies, greedy discard and varimax rotation, by assessing the ability of the selected htSNPs to reconstruct genotypes of non-htSNPs. We also compared these methods with two other htSNP finders, one of which is PCA based. We applied these methods to three experimental data sets and found that the PCA-based methods tend to select the smallest set of htSNPs to achieve a 90% reconstruction precision.