Similar literature
 18 similar articles found (search time: 238 ms)
1.
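Covariance-matrix principal component analysis of human population genetic structure   Total citations: 3, self-citations: 0, citations by others: 3
Objective: To examine whether principal component analysis (PCA) of the centered (or mean-scaled) covariance matrix of a gene frequency matrix is applicable and sound for studying human population genetic structure. Methods: Starting from the structural features of the gene frequency matrix, the study analyzed how PCA of the centered and mean-scaled covariance matrices differs from PCA of the standardized correlation matrix in eigenvalues, eigenvectors, and dimension reduction, and used a worked example to compare how reasonably each method explains population genetic structure. Results: The principal components of the centered (or mean-scaled) covariance matrix reflect both the "variance information weight" (the degree of gene variation) and the "correlation information weight" (the degree of mutual influence among genes); the principal components of the standardized correlation matrix reflect only the "correlation information weight" and exclude the "variance information weight". A comparison of the two PCA results for the HLA-A locus in 26 Chinese Han populations confirmed that PCA of the centered covariance matrix outperforms PCA of the standardized correlation matrix in eigenvalues and eigenvectors, in the number of principal components retained, and in the plausibility of the population genetic interpretation of the components. Conclusion: When applying PCA to population genetic structure, a centering (or mean-scaling) transformation should be used to remove the effect of magnitude in the gene frequency matrix, and the principal components should then be extracted from its covariance matrix.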
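The contrast between covariance-matrix and correlation-matrix PCA can be illustrated with a minimal sketch. The frequencies below are hypothetical (two alleles of a multi-allelic locus in four populations), and the two-variable case is chosen only because its eigenvalues have a closed form; this is not the procedure or the data of the study above.

```python
import math

def covariance_matrix(X):
    """Sample covariance matrix of the column-centered data (rows = populations)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[sum((row[a] - means[a]) * (row[b] - means[b]) for row in X) / (n - 1)
             for b in range(p)] for a in range(p)]

def correlation_matrix(X):
    """Correlation matrix, i.e. the covariance matrix of standardized columns."""
    S = covariance_matrix(X)
    p = len(S)
    return [[S[a][b] / math.sqrt(S[a][a] * S[b][b]) for b in range(p)]
            for a in range(p)]

def eigenvalues_2x2(M):
    """Eigenvalues (largest first) of a symmetric 2x2 matrix, in closed form."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    mid = (a + d) / 2
    rad = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return mid + rad, mid - rad

# Hypothetical data: a common allele (column 0) and a rare allele (column 1).
freqs = [[0.30, 0.010], [0.20, 0.020], [0.25, 0.015], [0.35, 0.012]]

cov_eig = eigenvalues_2x2(covariance_matrix(freqs))
cor_eig = eigenvalues_2x2(correlation_matrix(freqs))

# Share of total variance carried by the first principal component.
cov_share = cov_eig[0] / sum(cov_eig)
cor_share = cor_eig[0] / sum(cor_eig)
print("covariance PCA, PC1 share: ", cov_share)
print("correlation PCA, PC1 share:", cor_share)
```

Standardizing forces every allele to unit variance, so the correlation-matrix eigenvalues depend only on the correlation between the two columns, whereas the covariance-matrix eigenvalues also carry the difference in how much each allele varies.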

2.
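Gui Hongsheng, Yang Li, Li Shengbin. Hereditas (Beijing), 2007, 29(12): 1443-1148
STRs, as highly polymorphic genetic markers, are widely used in population genetics research. For the genotype frequency and allele frequency data produced by STR typing, this article summarizes methods for computing and analyzing a range of parameters. The parameters include heterozygosity, polymorphism information content, the linkage disequilibrium coefficient, the inbreeding coefficient, genetic distance, and the fixation index; the analysis methods include principal component analysis, phylogenetic trees, analysis of molecular variance, the R matrix, geographic information systems, and spatial autocorrelation analysis. With these parameters and methods, key questions in population genetics, such as population genetic structure, genetic differentiation among populations, and human origins and evolution, can be addressed both intuitively and rigorously.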
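Two of the parameters listed above have compact standard definitions, sketched here for a single locus. The formulas are the usual textbook ones (expected heterozygosity as one minus the sum of squared allele frequencies, and the Botstein et al. form of polymorphism information content); the function names and the example frequencies are illustrative assumptions, not taken from the article.

```python
def expected_heterozygosity(p):
    """H = 1 - sum(p_i^2) for allele frequencies p at one locus."""
    return 1.0 - sum(q * q for q in p)

def polymorphism_information_content(p):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    homozygosity = sum(q * q for q in p)
    cross = sum(2.0 * p[i] ** 2 * p[j] ** 2
                for i in range(len(p)) for j in range(i + 1, len(p)))
    return 1.0 - homozygosity - cross

# Hypothetical allele frequencies at one STR locus (they sum to 1).
str_locus = [0.40, 0.30, 0.20, 0.10]
print("H   =", expected_heterozygosity(str_locus))
print("PIC =", polymorphism_information_content(str_locus))
```

PIC is never larger than the expected heterozygosity, because it subtracts an extra non-negative term for matings that are uninformative for linkage.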

3.
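A "kriging" model of the spatial genetic structure of human populations   Total citations: 3, self-citations: 0, citations by others: 3
By applying the "kriging" technique to human population genetics, a "kriging" model of the spatial genetic structure of human populations was constructed, and its principles and computational methods are described. Taking the HLA-A locus as an example, the model was used to quantify the spatial genetic heterogeneity of the HLA-A locus in Chinese populations; principal component analysis was performed on the spatial data matrix of HLA-A gene frequencies, from which a synthetic genetic measure (SPC) of human population genetic structure was defined; "kriging" maps of the SPC and of the principal components (PC) were drawn; and the properties of the spatial genetic structure were analyzed. Compared with other spatial interpolation or smoothing methods, the "kriging" model has clear advantages: 1) "kriging" estimation is based on a spatial genetic variogram model, so before a map of spatial genetic structure is drawn, the variogram model can be used to quantify the spatial genetic heterogeneity of the locus (or loci) under study; 2) "kriging" interpolation is a genuinely unbiased estimation model: it uses the data from known population genetic survey points around the region to be estimated, takes full account of the range of spatial influence of those points, and gives the optimal estimate for the region; 3) the "kriging" model allows the interpolation error to be estimated, and this error can be used both to evaluate the quality of the spatial estimate and, through error maps, to guide the addition of new survey sample points where the error is too high, thereby improving the estimate. However, the model also has drawbacks: 1) if no theoretical genetic variogram model can be fitted to the observed genetic variogram values, a "kriging" model cannot be built; 2) if the goodness of fit of the theoretical genetic variogram is very low, the estimation standard deviation of the resulting "kriging" model will be large across the whole space, and the model is then unsuitable for estimating spatial genetic structure. In either case, a spatial random interpolation method that does not consider spatial correlation should be used to map population genetic structure, such as the Cavalli-Sforza method in gene mapping software, inverse distance weighting, or spline interpolation.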
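The variogram step that kriging builds on can be sketched as follows. This computes only the empirical (observed) semivariogram, binned by distance; fitting a theoretical variogram model and solving the kriging system are not shown. The coordinates, frequencies, and the `bin_width` parameter are hypothetical, and the function name is an assumption for illustration.

```python
import math
from collections import defaultdict

def empirical_semivariogram(coords, values, bin_width):
    """Return {bin index: semivariance}, where bin b covers lags in
    [b * bin_width, (b + 1) * bin_width) and
    gamma(h) = sum over pairs of (z_i - z_j)^2 / (2 * number of pairs)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            lag = math.dist(coords[i], coords[j])  # Euclidean distance
            b = int(lag // bin_width)
            sums[b] += (values[i] - values[j]) ** 2
            counts[b] += 1
    return {b: sums[b] / (2 * counts[b]) for b in sorted(counts)}

# Hypothetical survey points (x, y) and the frequency of one allele at each.
sites = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
allele_freq = [0.10, 0.12, 0.16, 0.11, 0.20]
for b, gamma in empirical_semivariogram(sites, allele_freq, 1.0).items():
    print(f"lag bin {b}: semivariance {gamma:.5f}")
```

A semivariance that rises with lag distance indicates spatial dependence; a flat profile corresponds to the case in which, as noted above, no useful variogram model can be fitted.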

4.
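Understanding the genetic diversity and population structure of grain sorghum can effectively improve the efficiency of breeding new grain sorghum varieties. In this study, genotyping by sequencing (GBS) was used for genome-wide genotyping of 120 grain sorghum accessions, yielding 3456 polymorphic SNP markers whose polymorphism information content (PIC) ranged from 0.013 to 0.574, with a mean of 0.381. Genetic distances among accessions, computed from the SNP genotypes of the 120 accessions, ranged from 0.084 to 0.613, with a mean of 0.365. Phylogenetic tree analysis and principal component analysis both divided the 120 accessions into three groups: group 1 consisted mainly of distantly related accessions, including the US accession MN-3609; group 2 mainly of accessions from northern China; and group 3 mainly of accessions from southern China. Population structure analysis showed that ΔK reached its maximum at K=3, indicating that the 120 accessions can be divided into three groups, a result broadly consistent with the phylogenetic tree and principal component analyses. This study characterizes the genetic background and population structure of grain sorghum at the level of genotypic diversity and provides a theoretical basis for breeding new grain sorghum varieties in China.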

5.
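Segregation analysis methods for the genetic system of quantitative traits in plants   Total citations: 67, self-citations: 2, citations by others: 65
Gai Junyi. Hereditas (Beijing), 2005, 27(1): 130-136
Building on the traditional polygene model of quantitative trait inheritance, it is proposed that the major gene plus polygene model is the general case, with the pure major-gene and pure polygene models as special cases. On this basis, a segregation analysis method for the genetic system of plant quantitative traits was preliminarily established. The method can currently test the individual genetic effects of two to three major genes, the collective genetic effect of the polygenes, and the heritability of both. This paper describes the development, main advances, and practical performance of this segregation analysis method, and uses a worked example to illustrate its steps, methods, and results.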

6.
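Xue Fuzhong, Wang Jiezhen, Guo Yishou, Hu Ping. Hereditas (Beijing), 2005, 27(6): 972-979
This paper examines the mechanism that produces the "horseshoe effect" in correspondence analysis of human population genetic structure, and its genetic interpretation. Starting from the structural features of the gene frequency matrix, worked examples were used to verify and compare the structure of the scatter plots produced by correspondence analysis. The distribution pattern of the scatter plot differs with the structure of the gene frequency matrix; when the matrix contains rare genes, the scatter plot shows a pronounced "horseshoe effect". The "horseshoe effect" often distorts the true form of the underlying genetic structure, and it arises mainly because the χ² distance, the dissimilarity measure used in correspondence analysis, overestimates the role of rare genes. When a "horseshoe effect" appears in correspondence analysis of human population genetic structure, the structure of the gene frequency matrix should be examined carefully, the cause of the effect identified, and a reasonable genetic interpretation given, so as to avoid erroneous conclusions.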
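Why the χ² distance gives rare genes so much weight can be seen from its definition: each squared difference between two row profiles is divided by the column mass, which is tiny for a rare allele. The sketch below uses a hypothetical two-population, three-allele table and a helper name chosen for illustration; it assumes every allele occurs in at least one population, so no column mass is zero.

```python
def chi2_contributions(table, i, k):
    """Per-column contributions to the squared chi-square distance between
    the row profiles of rows i and k:
        (profile_i[j] - profile_k[j])^2 / column_mass[j]."""
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    col_mass = [sum(row[j] for row in table) / total for j in range(n_cols)]
    prof_i = [x / sum(table[i]) for x in table[i]]
    prof_k = [x / sum(table[k]) for x in table[k]]
    return [(a - b) ** 2 / c for a, b, c in zip(prof_i, prof_k, col_mass)]

# Hypothetical gene frequencies: rows are populations, columns are alleles.
# Alleles 0 and 2 both differ by 0.02 between the populations,
# but allele 2 is rare (mean frequency 0.01).
table = [[0.60, 0.40, 0.00],
         [0.58, 0.40, 0.02]]

contrib = chi2_contributions(table, 0, 1)
for j, c in enumerate(contrib):
    print(f"allele {j}: contribution {c:.6f}")
print("squared chi-square distance:", sum(contrib))
```

In this example the same absolute difference in frequency contributes far more to the distance when it occurs at the rare allele than at the common one, which is the overweighting the abstract describes.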

7.
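From the perspective of maternal inheritance, this study examines the genetic structure and genetic differentiation of the Dong, Gelao, Tujia, and Yi populations long resident in Guizhou, and offers a preliminary discussion of the origins and migrations of each group. mtDNA polymorphism was analyzed in 108 samples from the four populations by combining hypervariable region sequencing with PCR-RFLP analysis of the coding region, and 37 (sub)haplogroups were identified. Haplogroup frequencies and principal component analysis showed that the Dong carry a high proportion of haplogroups predominant in the south, a typical southern population profile; the Yi carry high proportions of both southern- and northern-predominant haplogroups, suggesting that they share some maternal genetic features of both southern and northern populations; and the Yi and Gelao cluster together, possibly because the ancestors of the two groups exchanged genes extensively in the past.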

8.
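Miao Yongmei, Sui Yihu, Jian Xing. Guihaia, 2015, 35(5): 704-708
To understand the inheritance of male floral organs in cucumber, a four-generation genetic population was constructed using the South China type cucumber Erzaozi (二早子), which has small male flowers, as the female parent and the processing type cucumber NC-76, which has larger flowers, as the male parent, and the inheritance of male floral organ traits was analyzed by joint segregation analysis of multiple generations. Pedicel length and corolla length of male flowers were both unimodally distributed in the segregating populations, indicating that both are quantitative traits under major-gene control. Pedicel length fitted the model of two completely dominant major genes plus additive-dominant polygenes (E-5), and corolla length fitted the model of two additive-dominant-epistatic major genes plus additive-dominant-epistatic polygenes (E-1). The two major genes for pedicel length had equal additive effects of 0.573, and the additive and dominance effects of the polygenes were similar in size and both negative. The two major genes for corolla length both had additive effects of 0, with dominance effects of -0.226 and -0.472; epistasis was mainly additive × additive and dominance × dominance interaction; the polygenes acted mainly through dominance, with a positive dominance effect of 0.613 that exceeded the negative additive effect. In the F2 population, major-gene heritability was 61.04% for pedicel length and 69.60% for corolla length, and polygene heritability was 0 for both. Male floral organ traits in cucumber are therefore quantitatively inherited with relatively high heritability, and the results indicate that selection for floral organ size in cucumber cross-breeding can be carried out in early generations.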

9.
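Detecting QTL systems and estimating their genetic effects using DH or RIL populations   Total citations: 39, self-citations: 1, citations by others: 38
Zhang Yuanming, Gai Junyi. Acta Genetica Sinica, 2000, 27(7): 634-640
Segregation analysis of quantitative traits with low heritability, such as crop yield, can be made more precise by using DH and RIL populations together with a randomized block design with sets nested within replications. Based on mixture distribution theory, a segregation analysis method is proposed that uses replicated experimental data from DH or RIL populations to identify mixed inheritance models of quantitative traits, in particular the model of two linked major genes plus polygenes. The method can identify the genetic model of a quantitative trait and the mode of action of the major genes, estimate the genetic effects and genetic variances of the major genes and polygenes, and, when the two major genes are linked, estimate their recombination rate. The method is illustrated with an application example.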

10.
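A flax recombinant inbred line (RIL) population of 162 lines, derived from the oil cultivar Longya 8 (陇亚8号) and the fiber cultivar Alian (阿里安), was used as the study material. Fatty acid content of the RIL population was measured by gas chromatography, its genetic variation and distribution were analyzed, and a preliminary genetic analysis of crude fat and five fatty acids was carried out with the major gene plus polygene mixed inheritance model, to provide a reference for further research on and use of this RIL population. Crude fat and fatty acid contents varied widely in the RIL population and showed transgressive segregation; their distributions were approximately normal, the typical pattern of continuous variation in quantitative traits. Under the major gene plus polygene model, crude fat content was controlled by three pairs of equal-additive major genes, with a major-gene heritability of 85%. Among the five fatty acids, linolenic acid content was controlled by two pairs of major genes with duplicate effects (major-gene heritability 36%), and linoleic acid content by three pairs of equal-additive major genes (major-gene heritability 80%); oleic, palmitic, and stearic acid contents showed polygenic inheritance with no major-gene effect. In addition, 11 superior high-oil, high-linoleic-acid, and high-linolenic-acid lines were selected, providing new material for quality breeding in flax.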

11.
Increasing attention is being devoted to taking landscape information into account in genetic studies. Among landscape variables, space is often considered as one of the most important. To reveal spatial patterns, a statistical method should be spatially explicit, that is, it should directly take spatial information into account as a component of the adjusted model or of the optimized criterion. In this paper we propose a new spatially explicit multivariate method, spatial principal component analysis (sPCA), to investigate the spatial pattern of genetic variability using allelic frequency data of individuals or populations. This analysis does not require data to meet Hardy-Weinberg expectations or linkage equilibrium to exist between loci. The sPCA yields scores summarizing both the genetic variability and the spatial structure among individuals (or populations). Global structures (patches, clines and intermediates) are disentangled from local ones (strong genetic differences between neighbors) and from random noise. Two statistical tests are proposed to detect the existence of both types of patterns. As an illustration, the results of principal component analysis (PCA) and sPCA are compared using simulated datasets and real georeferenced microsatellite data of Scandinavian brown bear individuals (Ursus arctos). sPCA performed better than PCA at revealing spatial genetic patterns. The proposed methodology is implemented in the adegenet package of the free software R.
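The distinction between global and local structures can be illustrated with Moran's I, the spatial autocorrelation index on which sPCA is built (the abstract itself does not name it, so treat this as background rather than a description of the paper's procedure). The sketch is in Python rather than the adegenet R package; the chain-shaped neighbor graph and the scores are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I = (n / S0) * sum_ij w_ij * d_i * d_j / sum_i d_i^2,
    where d_i are deviations from the mean and S0 is the sum of all weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)

def chain_weights(n):
    """Binary neighbor matrix for n sites arranged along a line (an assumption)."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
            for i in range(n)]

w = chain_weights(4)
cline = [1.0, 2.0, 3.0, 4.0]        # neighbors are similar: global structure
alternating = [1.0, 4.0, 1.0, 4.0]  # neighbors differ strongly: local structure
print("cline:      ", morans_i(cline, w))
print("alternating:", morans_i(alternating, w))
```

Positive autocorrelation corresponds to what the abstract calls global structures (patches, clines), and negative autocorrelation to local structures (strong differences between neighbors).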

12.
H Gao, T Zhang, Y Wu, Y Wu, L Jiang, J Zhan, J Li, R Yang. Heredity, 2014, 113(6): 526-532
Given the drawbacks of implementing multivariate analysis for mapping multiple traits in genome-wide association study (GWAS), principal component analysis (PCA) has been widely used to generate independent 'super traits' from the original multivariate phenotypic traits for the univariate analysis. However, parameter estimates in this framework may not be the same as those from the joint analysis of all traits, leading to spurious linkage results. In this paper, we propose to perform the PCA for residual covariance matrix instead of the phenotypical covariance matrix, based on which multiple traits are transformed to a group of pseudo principal components. The PCA for residual covariance matrix allows analyzing each pseudo principal component separately. In addition, all parameter estimates are equivalent to those obtained from the joint multivariate analysis under a linear transformation. However, a fast least absolute shrinkage and selection operator (LASSO) for estimating the sparse oversaturated genetic model greatly reduces the computational costs of this procedure. Extensive simulations show statistical and computational efficiencies of the proposed method. We illustrate this method in a GWAS for 20 slaughtering traits and meat quality traits in beef cattle.

13.
Using 11 microsatellite markers, we investigated the allelic variation and genetic structure of Cryptomeria japonica, across most of its natural distribution. The markers displayed high levels of polymorphism (average gene diversity=0.77, average number of alleles=24.0), in sharp contrast to the lower levels of polymorphism found in allozyme and cleaved amplified polymorphic sequence markers in previous studies. Little genetic differentiation was found among populations (FST=0.028, P<0.001), probably because the species is wind-pollinated and long-lived. No clear relationship between Nei's genetic distances and geographical locations of the populations was found using the principal coordinate and unweighted pair-group method with arithmetic averaging analyses. The lack of such trends might be due partly to microsatellite homoplasy arising from mutation blurring the genealogical record. However, there was a trend towards high allelic diversity in five populations (Ashitaka, Ashiu, Oki-Island, Yakushima-Island-1 and -2), which are very close to, or in, refugial areas of the last glacial period as defined by Tsukada based on pollen analysis data and current climatic divisions. We postulate that these refugial populations might have been less affected by genetic drift than the other populations due to their relatively large size.

14.
Applications of quantitative techniques to understanding macroevolutionary patterns typically assume that genetic variances and covariances remain constant. That assumption is tested among 28 populations of the Phyllotis darwini species group (leaf-eared mice). Phenotypic covariances are used as a surrogate for genetic covariances to allow much greater phylogenetic sampling. Two new approaches are applied that extend the comparative method to multivariate data. The efficacy of these techniques is compared, and their sensitivity to sampling error examined. Pairwise matrix correlations of correlation matrices are consistently very high (> 0.90) and show no significant association between matrix similarity and phylogenetic relatedness. Hierarchical decomposition of common principal component (CPC) analyses applied to each clade in the phylogeny rejects the hypothesis that common principal component structure is shared in clades more inclusive than subspecies. Most subspecies also lack a common covariance structure as described by the CPC model. The hypothesis of constant covariances must be rejected, but the magnitudes of divergence in covariance structure appear to be small. Matrix correlations are very sensitive to sampling error, while CPC is not. CPC is a powerful statistical tool that allows detailed testing of underlying patterns of covariation.

15.
The vibrational circular dichroism (VCD) spectra of 20 proteins dissolved in D2O are presented in the amide I' region. These data are decomposed into a linear combination of orthogonal subspectra generated by the principal component method of factor analysis, and the results for 13 of them are compared to their secondary structures as determined from X-ray crystallography. Factor analysis of the VCD yields six statistically significant subspectra that can be used to reproduce the spectra. Their coefficients can then be used to characterize a given protein. Comparison of cluster analyses of these VCD coefficients and of the secondary structure fractional coefficients from X-ray crystallography showed that proteins clustered in the VCD analysis were also clustered in the X-ray analysis. The relative fractions of alpha-helix and beta-sheet in the protein dominate the clustering in both data sets. Qualitative characterization of the secondary structure of a given protein is obtained from its clustering on the basis of spectral characteristics. A strong linear correlation was found between the coefficient of the second subspectrum and the alpha-helical fraction for the proteins studied. The second coefficient also correlated to the beta-sheet fraction, and the first coefficient weakly correlated to the fraction for "other". Subsequent multiple-parameter regression analyses of the VCD factor analysis coefficients, constrained to include only significant dependencies, yielded reliable determination of the alpha-helix fraction and somewhat less confident determination of beta-sheet, bend, and "other" components. Predictive capability for proteins not in the regression was good. Varimax rotation of the coefficients transformed the subspectra and gave simple correlations to secondary structure components but had less reliability and more restrictions than the multiple regression on the original coefficients. The partial least-squares analysis method was also used to predict fractional secondary structures for the training set proteins but resulted in somewhat higher average error, particularly for beta-sheet, than the multiple regression. The turn fraction was effectively undetermined in both the regression and partial least-squares analyses. These statistical analyses represent the first determination of a quantitative relationship between VCD spectra and secondary structure in proteins.

16.
Working with weakly congruent markers means that consensus genetic structuring of populations requires methods explicitly devoted to this purpose. The method, which is presented here, belongs to the multivariate analyses. This method consists of different steps. First, single-marker analyses were performed using a version of principal component analysis, which is designed for allelic frequencies (%PCA). Drawing confidence ellipses around the population positions enhances %PCA plots. Second, a multiple co-inertia analysis (MCOA) was performed, which reveals the common features of single-marker analyses, builds a reference structure and makes it possible to compare single-marker structures with this reference through graphical tools. Finally, a typological value is provided for each marker. The typological value measures the efficiency of a marker to structure populations in the same way as other markers. In this study, we evaluate the interest and the efficiency of this method applied to a European and African bovine microsatellite data set. The typological value differs among markers, indicating that some markers are more efficient in displaying a consensus typology than others. Moreover, efficient markers in one collection of populations do not remain efficient in others. The number of markers used in a study is not a sufficient criterion to judge its reliability. "Quantity is not quality".

17.
Strategies for genetic mapping of categorical traits   Total citations: 3, self-citations: 0, citations by others: 3
Shaoqi Rao, Xia Li. Genetica, 2000, 109(3): 183-197
The search for efficient and powerful statistical methods and optimal mapping strategies for categorical traits under various experimental designs continues to be one of the main tasks in genetic mapping studies. Methodologies for genetic mapping of categorical traits can generally be classified into two groups, linear and non-linear models. We develop a method based on a threshold model, termed mixture threshold model, to handle ordinal (or binary) data from multiple families. Monte Carlo simulations are done to compare the statistical efficiency and properties of the proposed non-linear model with those of a linear model for genetic mapping of categorical traits using multiple families. The mixture threshold model has notably higher statistical power than linear models. There may be an optimal sampling strategy (family size vs number of families) in which genetic mapping reaches its maximal power and minimal estimation errors. A single large-sibship family does not necessarily produce the maximal power for detection of quantitative trait loci (QTL) due to genetic sampling of QTL alleles. The QTL allelic model has a marked impact on efficiency of genetic mapping of categorical traits in terms of statistical power and QTL parameter estimation. Compared with a fixed number of QTL alleles (two or four), the model with an infinite number of QTL alleles and normally distributed allelic effects results in loss of statistical power. The results imply that inbred designs (e.g. F2 or four-way crosses) with a few QTL alleles segregating or reducing number of QTL alleles (e.g. by selection) in outbred populations are desirable in genetic mapping of categorical traits using data from multiple families. This revised version was published online in July 2006 with corrections to the Cover Date.

18.
BACKGROUND AND AIMS: The genetic and morphological variation in the sago palm (Metroxylon sagu, Arecaceae) in Papua New Guinea (PNG) was investigated. METHODS: Amplified fragment length polymorphism (AFLP) was used to investigate the genetic structure of 76 accessions of M. sagu, collected in seven wild and semi-wild stands in PNG. KEY RESULTS: An analysis of ten quantitative morphological variables revealed that most of these were mutually correlated. Principal component analyses of the same morphological variables showed that neither armature (presence or absence of spines) nor geographical separation was reflected clearly in the quantitative morphological variation. Similarity matrices of genetic, quantitative morphological, geographical and armature data were tested for pair-wise correlations, using Mantel's test. The results only showed a significant correlation between genetic and geographical distances. Visual inspection of principal component analyses plots and a neighbour-joining dendrogram based on genetic distances supported this trend, whereas armature showed no relation with genetic distances. CONCLUSIONS: Geographical distribution defines some weak patterns in the genetic variation, whereas the genetic variation does not reflect any patterns in the morphological variation, including armature. The present study supports the accepted taxonomy of M. sagu, recognizing only one species of M. sagu in PNG.
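Mantel's test, used above to compare distance matrices, can be sketched as a one-sided permutation test: correlate the off-diagonal entries of two matrices, then permute the rows and columns of one matrix together to build a null distribution. The matrices, the permutation count, and the seed below are hypothetical, and the implementation is a plain-Python illustration rather than the software used in the study.

```python
import math
import random

def upper_triangle(M):
    """Entries above the diagonal of a square matrix, row by row."""
    n = len(M)
    return [M[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel_test(A, B, permutations=999, seed=0):
    """Return (r, p): the matrix correlation and a one-sided permutation p-value."""
    rng = random.Random(seed)
    n = len(A)
    a = upper_triangle(A)
    r_obs = pearson(a, upper_triangle(B))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)  # relabel the objects of B, rows and columns together
        Bp = [[B[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(a, upper_triangle(Bp)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

# Hypothetical distances among five stands along a transect.
geo = [[float(abs(i - j)) for j in range(5)] for i in range(5)]
gen = [[0.0 if i == j else 0.05 * abs(i - j) + 0.01 * ((i * j) % 3)
        for j in range(5)] for i in range(5)]
r, p = mantel_test(geo, gen)
print("Mantel r =", r, " p =", p)
```

Permuting whole rows and columns, rather than individual entries, keeps each matrix a valid distance matrix, which is why the test suits the non-independent entries of such matrices.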
