Similar literature
1.
2.
Statistical Properties of a DNA Sample under the Finite-Sites Model   Cited by: 1 (self-citations: 0, other citations: 1)
Z. Yang. Genetics, 1996, 144(4): 1941-1950
Statistical properties of a DNA sample from a random-mating population of constant size are studied under the finite-sites model. It is assumed that there is no migration and that no recombination occurs within the locus. A Markov process model is used for nucleotide substitution, allowing for multiple substitutions at a single site. The evolutionary rates among sites are treated as either constant or variable. The general likelihood calculation using numerical integration involves intensive computation and is feasible for three or four sequences only; it may be used for validating approximate algorithms. Methods are developed to approximate the probability distribution of the number of segregating sites in a random sample of n sequences, with either constant or variable substitution rates across sites. Calculations using parameter estimates obtained for human D-loop mitochondrial DNAs show that among-site rate variation has a major effect on the distribution of the number of segregating sites; the distribution under the finite-sites model with variable rates among sites is quite different from that under the infinite-sites model.

3.
4.
This paper is motivated by a practical problem relating to student performance in a number of subjects of equal standing. Its mathematical formulation is to find an approximation to a multivariate probability of the form Pr{X1 ≤ a, X2 ≤ a, …, XN ≤ a} for arbitrary a and N, in terms of p = Pr{X1 ≤ a} and q = Corr(Xi, Xj), i ≠ j, where Xi, i = 1, …, N, are exchangeable random variables with mean 0 and variance unity.

5.
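A GFU model for non-homogeneous multi-arm clinical trials is studied, and an adaptive design based on the model is established. For this model, estimators of the parameters are constructed, and their strong consistency, rates of convergence, and asymptotic normality are obtained. These results provide a theoretical basis for applications in clinical trial design.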

6.
7.
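A stochastic model of AIDS transmission is studied. We give a condition under which the trivial solution of the stochastic logistic equation is asymptotically stable no matter how strong the environmental disturbance is. Together with the results of Roberts and Saha, our conclusion forms a complete analysis of the AIDS transmission model.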

8.
9.
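A stochastic SIR epidemic model in which the recovery rate is perturbed by noise is considered. First, the global existence and uniqueness of the nonnegative solution of the model are proved. Second, it is shown that the disease-free equilibrium is stochastically asymptotically stable when the basic reproduction number R0 ≤ 1, and that the solution of the stochastic model oscillates around the endemic equilibrium of the deterministic model when R0 > 1. Finally, numerical simulations are used to verify the conclusions.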

10.
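Asymptotic properties of a predator-prey model with stage structure   Cited by: 1 (self-citations: 0, other citations: 1)
The asymptotic properties of a predator-prey model with stage structure are studied. The rate of transition from the juvenile stage to the adult stage is assumed to depend on the number of juveniles. Conditions for the uniform persistence and the extinction of the predator population are established, and the existence of a stable periodic solution is proved.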

11.
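A Schoener competition model with time delay is studied. By constructing a suitable Lyapunov function, the global stability of the positive solution of the system is obtained under certain conditions.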

12.
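A two-step estimation method is proposed for the multivariate binary response Probit model. The first step obtains a √n-consistent estimate of the parameters from the marginal likelihood, and the second step obtains an asymptotically efficient estimate through a single iteration. Because only one iteration is needed, the number of simulation draws used to compute the information matrix can be increased, which reduces the effect of simulation noise on the estimate.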

13.
14.
Soil samples have to be stored during transportation and investigation in the laboratory if they are not analyzed directly at the site. Existing standards and investigations give no recommendations on handling and storage of soil materials for (eco-)toxicological investigations. The objective of this investigation was to determine if microbial turnovers and losses of volatile organic compounds mainly cause storage-dependent changes in soil samples. Furthermore, recommendations are given for storage of soil reserve samples for toxicological investigations. During 18 months of storage, the microbial respiration of six highly contaminated soil samples was determined. Physicochemical characteristics, such as contaminant and nutrient content, were analyzed before and after storage. From the investigations it can be concluded that the oxygen consumption depends on the storage temperature, organic matter content, nutrient content, and total content of toxic substances. Based on the results, a flow scheme was derived that could be a useful tool for a sequential approach to determine the storage capacity of contaminated soil samples and sites for toxicological investigations.

15.
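Identification of molecular markers associated with the expression of rice root traits by correlation analysis   Cited by: 15 (self-citations: 1, other citations: 15)
Xu Jichen, Zou Liangxing. Acta Genetica Sinica (遗传学报), 2002, 29(3): 245-249
Eighty-four rice varieties were grown in nutrient solution, and the maximum root length (MRL) and root dry weight (RDW) of each variety were measured after 10 days. Twenty-seven representative varieties were selected for analysis of genomic differences using the amplified fragment length polymorphism (AFLP) technique. Correlation coefficients between polymorphic bands and trait values were calculated to screen for molecular markers significantly correlated with MRL and RDW at the seedling stage. Screening with 15 AFLP primer pairs identified 7 fragments, from 4 primer pairs, whose genotypes were significantly correlated with MRL and/or RDW. One of these fragments, "T3P3f", was cloned and sequenced, and a specific PCR primer, "Z336", was designed and used to genotype all 84 varieties. Statistical analysis showed that the correlation coefficient between Z336 and MRL was -0.193, close to the significance level, and that the correlation coefficient between Z336 and RDW was -0.391, which is highly significant. The marker explained 15.3% of the variation in RDW, indicating that it is closely linked to a quantitative gene controlling RDW and that its presence is significantly associated with lower trait values. Gene mapping with a doubled haploid (DH) segregating population derived from ZYQ8 and JX17 located Z336 on rice chromosome 11, 9.4 cM from the adjacent molecular marker.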

16.
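Study on the correlation between genetic variation of the HN gene of Newcastle disease virus and HI   Cited by: 8 (self-citations: 0, other citations: 8)
Ten field strains of Newcastle disease virus (NDV) isolated in China between 1999 and 2004 were plaque-purified on CEF and propagated in SPF chicken embryos. Their hemagglutinin-neuraminidase (HN) genes were cloned and sequenced. Together with reference sequences published in GenBank, such as LaSota, F48E9, and Clone30, the amino acid sequences were analyzed for genetic variation and a phylogenetic tree was constructed. Monospecific positive sera against each of these NDV strains were prepared in SPF chickens housed in isolators, cross hemagglutination inhibition (HI) tests were performed, and the HI correlation coefficient (r) between strains was calculated. The statistical software SPSS 8.0 was used to compare the HN amino acid homology with the HI correlation coefficient (r) between NDV strains. The results showed that the field strains were highly homologous at the amino acid level (96.5%~99.8%), whereas their homology with LaSota, F48E9, and Clone30 was only 87.4%~89.9%; that all field strains lacked one potential glycosylation site; and that the amino acid homology of the HN gene was significantly correlated with the HI correlation coefficient (P<0.01, r=0.55).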

17.
The correlation coefficient is commonly used as a measure of the divergence of gene expression profiles between different species. Here we point out a potential problem with this statistic: if measurement error is large relative to the differences in expression, the correlation coefficient will tend to show high divergence for genes that have relatively uniform levels of expression across tissues or time points. We show that genes with a conserved uniform pattern of expression have significantly higher levels of expression divergence, when measured using the correlation coefficient, than other genes, in a data set from mouse, rat, and human. We also show that the Euclidean distance yields low estimates of expression divergence for genes with a conserved uniform pattern of expression.

It is now possible to measure the expression levels of thousands of genes in multiple tissues at multiple times. This has led to investigations into the evolution of gene expression and how the pattern of expression changes on a genomic scale. In some analyses, the evolution of expression is considered only within one tissue, but in many studies the evolution across multiple tissues is investigated. In this latter case, the evolution of an expression profile (a vector of expression levels of a gene across several tissues) is considered.

Several different statistics have been proposed to measure the divergence between gene expression profiles. The two most popular measures are the Euclidean distance (Jordan et al. 2005; Kim et al. 2006; Yanai et al. 2006; Urrutia et al. 2008) and Pearson's correlation coefficient (Makova and Li 2003; Huminiecki and Wolfe 2004; Yang et al. 2005; Kim et al. 2006; Liao and Zhang 2006a,b; Xing et al. 2007; Urrutia et al. 2008). The correlation coefficient is often subtracted from one, so that the statistic varies from zero, when there has been no expression divergence, to a maximum of two; we refer to this statistic as the Pearson distance.
Here we describe a significant shortcoming of the Pearson distance that is not shared by the Euclidean distance.

To investigate properties of these two measures of expression divergence, we compiled a data set of 2859 orthologous genes from human, mouse, and rat for which we had microarray expression data from nine homologous tissues (bone marrow, heart, kidney, large intestine, pituitary, skeletal muscle, small intestine, spleen, and thymus). The expression data for rat came from Walker et al. (2004), the mouse data from Su et al. (2004), and the human data from Ge et al. (2005). Each tissue experiment had two replicates in mouse, a varying number of replicates in rat, and one in humans; some genes were also matched by multiple probe sets. To obtain an average across experiments and probe sets we processed the data as follows:
  1. Raw CEL files of gene expression levels were obtained from the NCBI Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/projects/geo/).
  2. The results from the mouse, rat, and human arrays were normalized separately using both the MAS5 (Affymetrix 2001) and the RMA algorithms (Irizarry et al. 2003) as implemented in Bioconductor (Gentleman et al. 2004). The results are qualitatively similar for the two normalization procedures, although recent analyses suggest that MAS5 normalization is generally better (Ploner et al. 2005; Lim et al. 2007).
  3. The expression of each gene within a tissue was averaged across experiments and probe sets.
We computed expression distances (ED) between orthologous gene expression profiles, for each of the three species comparisons, rat–mouse, rat–human, and mouse–human, according to the two different distance metrics, the Euclidean distance and the Pearson distance:

$$\mathrm{EucD} = \sqrt{\sum_{j=1}^{k} (x_{1j} - x_{2j})^2}, \qquad \mathrm{PeaD} = 1 - \frac{\sum_{j=1}^{k} (x_{1j} - \bar{x}_1)(x_{2j} - \bar{x}_2)}{\sqrt{\sum_{j=1}^{k} (x_{1j} - \bar{x}_1)^2 \, \sum_{j=1}^{k} (x_{2j} - \bar{x}_2)^2}} \tag{1}$$

Here $x_{ij}$ is the expression level of the gene under consideration in species i in tissue j, and $\bar{x}_i$ is the average expression level of the gene in species i across tissues. Expression levels are known in a total of k tissues.

Because expression levels are measured on different microarray platforms in the three species, we compute relative abundance (RA) values before calculating the Euclidean distance (Liao and Zhang 2006a). The RA is the expression of a gene in a particular tissue divided by the sum of the expression values of that gene across all tissues. We calculated RA values to remove "probe" effects (the tendency for a gene to bind its probe set on one platform more efficiently than on another platform). Because of probe effects it is not easy to distinguish absolute changes in expression from differences in binding efficiency. Calculating RA values removes this problem from the Euclidean distance. Pearson's distance does not change under such a rescaling, so the transformation is unnecessary for it.

In some analyses the logarithm of the expression or RA values is used (e.g., Makova and Li 2003; Kim et al. 2006; Xing et al. 2007), and in others the expression values are used without this transformation (e.g., Huminiecki and Wolfe 2004; Jordan et al. 2005; Yang et al. 2005; Liao and Zhang 2006a,b; Yanai et al. 2006; Urrutia et al. 2008). We calculated both the Pearson and the Euclidean distances on log-transformed and untransformed expression values.
The results are qualitatively similar, so here we present only the results obtained using the logarithm of the expression or RA values.

It is natural to expect the two measures of expression divergence to be positively correlated with one another; however, the Euclidean and Pearson distances are almost completely uncorrelated (MAS5 normalization, mouse–rat correlation coefficient = 0.06, human–rat r = 0.13, human–mouse r = 0.10; RMA normalization, mouse–rat correlation coefficient = −0.12, human–rat r = −0.00, human–mouse r = −0.08; Figure 1). This could, plausibly, be because the two statistics measure different aspects of divergence. However, irrespective of this, there is a potential problem associated with the Pearson distance. Imagine that we have a gene that is expressed at identical levels in all tissues in two species (i.e., expression levels are uniform between tissues and also between species). We quite reasonably assume that measured expression levels contain noise. Thus each measured expression level (xij) is the sum of the (assumed) uniform expression level and an independent random number representing noise. In this case there is no real divergence in the expression profile between the species. However, the two measures of divergence may differ greatly in this case. The Euclidean distance reflects only the noise present in the data and hence will be small if the noise is small. By contrast, the Pearson distance will have a value close to 1 since the second term in PeaD in Equation 1 will be close to zero, reflecting the fact that the noise components of different expression levels are independent. Thus the Pearson distance will give the impression that expression divergence is great, but all this apparent divergence is noise. This will be a problem with Pearson's distance whenever measurement error is of the same magnitude as the differences in expression between tissues.
This will therefore tend to be a problem for lowly expressed genes, where measurement error can be large relative to the true value.

Figure 1. The correlation between the Euclidean and Pearson distances for (a) mouse–rat, (b) human–rat, and (c) human–mouse. Only the results from MAS5 normalization are shown; qualitatively similar results were obtained with RMA.

The above example is unrealistic because real gene expression profiles are rarely perfectly uniform. To investigate whether this shortcoming of the Pearson distance is a problem in real data sets, we determined genes with a relatively uniform pattern of expression in all three species considered above. To do this we computed the entropy of a gene's expression, which is a measure of uniformity in expression across tissues (Schug et al. 2005): the higher the value of the entropy, the more uniform is the expression. We calculated the entropy for each gene in each of the three species, averaged these across species, and then took those genes in the upper quartile of mean entropy values as a data set of genes with a relatively conserved pattern of uniform expression.

It is natural to expect those genes with a conserved uniform pattern of expression to have relatively low expression divergence; however, on average these genes have significantly higher Pearson distances than other genes (Figure 2; supporting information, Figure S1 and Figure S2). By contrast, the Euclidean distance shows the pattern one would anticipate; all of the conserved uniform genes have low expression divergence. It therefore seems likely that the Pearson distance is sensitive to measurement error and hence may not be a good measure of expression divergence.

Figure 2. The distribution of expression divergence values for those genes with a uniform pattern of expression that is conserved across species vs. the distribution for all genes for (a) Pearson and (b) Euclidean distances for mouse–rat.
We present similar values for human–mouse and human–rat in Figure S1 and Figure S2. Only the results from MAS5 normalization are shown; qualitatively similar results were obtained with RMA.
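
The thought experiment about a uniformly expressed gene can be checked with a small simulation. The following is a minimal sketch and not the authors' analysis: it assumes Gaussian measurement noise, and the names `noisy_uniform_profile` and `simulate`, the expression level of 8.0, the noise standard deviation of 0.1, and the number of simulated genes are all hypothetical choices made for illustration. It uses the two distances as they are described in the text (Euclidean distance between profiles, and one minus Pearson's correlation coefficient).

```python
import math
import random


def euclidean_distance(x1, x2):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))


def pearson_distance(x1, x2):
    """One minus Pearson's correlation coefficient of two profiles."""
    m1 = sum(x1) / len(x1)
    m2 = sum(x2) / len(x2)
    cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    ss1 = sum((a - m1) ** 2 for a in x1)
    ss2 = sum((b - m2) ** 2 for b in x2)
    return 1.0 - cov / math.sqrt(ss1 * ss2)


def noisy_uniform_profile(rng, level, noise_sd, k):
    """Same true level in all k tissues, plus independent noise (assumed Gaussian)."""
    return [level + rng.gauss(0.0, noise_sd) for _ in range(k)]


def simulate(n_genes=2000, k=9, level=8.0, noise_sd=0.1, seed=1):
    """Mean Euclidean and Pearson distances for genes with no true divergence."""
    rng = random.Random(seed)
    euc_total = 0.0
    pea_total = 0.0
    for _ in range(n_genes):
        species_1 = noisy_uniform_profile(rng, level, noise_sd, k)
        species_2 = noisy_uniform_profile(rng, level, noise_sd, k)
        euc_total += euclidean_distance(species_1, species_2)
        pea_total += pearson_distance(species_1, species_2)
    return euc_total / n_genes, pea_total / n_genes


mean_euc, mean_pea = simulate()
print(f"mean Euclidean distance: {mean_euc:.3f}")
print(f"mean Pearson distance:   {mean_pea:.3f}")
```

Under these assumptions the mean Pearson distance comes out close to 1 even though the two species have identical true profiles, while the mean Euclidean distance stays on the scale of the noise and shrinks as `noise_sd` is reduced.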

TABLE 1

The median expression divergence for genes that have a conserved uniform pattern of expression (upper quartile of mean entropy values) vs. all other genes

Data set          Statistic    Conserved uniform genes    Other genes    Wilcoxon test P-value
MAS5 normalization
  Mouse–rat       Euclidean    1.66                       2.79           <10^-15
                  Pearson      0.70                       0.47           <10^-15
  Human–mouse     Euclidean    1.67                       3.13           <10^-15
                  Pearson      0.78                       0.58           <10^-15
  Human–rat       Euclidean    1.83                       3.21           <10^-15
                  Pearson      0.78                       0.58           <10^-15
RMA normalization
  Mouse–rat       Euclidean    0.59                       1.40           <10^-15
                  Pearson      0.82                       0.38           <10^-15
  Human–mouse     Euclidean    0.59                       1.58           <10^-15
                  Pearson      0.81                       0.48           <10^-15
  Human–rat       Euclidean    0.58                       1.55           <10^-15
                  Pearson      0.73                       0.50           <10^-15

We note that there are two additional advantages of the Euclidean distance. First, it can take into account differences in the absolute level of expression if those data are available, either because the method of assay allows this, for example, if ESTs, SAGE, sequencing, or RNA-Seq data are used, or because expression in the two species has been assessed on the same platform using probes that are conserved between the two species. Second, the square of the Euclidean distance is expected to increase linearly with time. Khaitovich et al. (2004) have previously shown that the squared difference in log expression level increases linearly with time under a Brownian motion model of gene expression evolution. It is therefore expected that the squared Euclidean distance will increase with time since the squared Euclidean distance is the sum of the squared differences across tissues. We prove this in File S1; we also show that this linearity holds, approximately, when relative abundance values are used (see also Pereira et al. 2009).
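
The linear growth of the squared Euclidean distance under Brownian motion can be illustrated numerically. This is a rough sketch under stated assumptions, not the proof in File S1: log expression in each tissue is assumed to evolve independently in each of two lineages, with a change that is Gaussian with variance sigma squared times t after time t, so the shared ancestral value cancels out of the difference. The function names, `sigma = 0.5`, `k = 9` tissues, and the number of simulated genes are hypothetical. Under these assumptions the expected squared distance is 2 · sigma² · t · k.

```python
import math
import random


def squared_euclidean_after(rng, t, k, sigma):
    """Squared Euclidean distance between two lineages after divergence time t."""
    sd = sigma * math.sqrt(t)  # Brownian motion: variance grows linearly with t
    total = 0.0
    for _ in range(k):
        change_a = rng.gauss(0.0, sd)  # change in lineage A for this tissue
        change_b = rng.gauss(0.0, sd)  # change in lineage B for this tissue
        total += (change_a - change_b) ** 2
    return total


def mean_squared_distance(t, k=9, sigma=0.5, n_genes=5000, seed=7):
    """Average squared Euclidean distance over many simulated genes."""
    rng = random.Random(seed)
    total = sum(squared_euclidean_after(rng, t, k, sigma) for _ in range(n_genes))
    return total / n_genes


for t in (1.0, 2.0, 4.0):
    expected = 2 * 0.5 ** 2 * t * 9  # 2 * sigma^2 * t * k
    print(f"t = {t}: simulated {mean_squared_distance(t):.2f}, expected {expected:.2f}")
```

Doubling `t` should roughly double the simulated mean, which is the linear relationship described above; the sketch says nothing about relative abundance values, for which the text states the linearity holds only approximately.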

18.
Properties of surface plasmon polaritons (SPPs) excited by radially polarized sinh Gaussian beams in a high-numerical-aperture system are investigated theoretically based on vector diffraction theory. It is observed that by properly tuning the beam waist size (w0) and beam order (m) of the incident sinh Gaussian beam, one can achieve tighter axial and lateral confinement of the generated plasmonic focal spot. We observed that a sinh Gaussian beam of larger w0 and m generates a highly confined plasmonic focal spot.

19.
20.
In this work, we study a multi-species aerobic chemostat model with constant recycle sludge concentration in continuous culture. We reduce the number of parameters by considering a dimensionless model. First, the existence of a global positive uniform attractor for the model with different removal rates is proved using the theory of dissipative dynamical systems. Next, we investigate the asymptotic behavior of the model under small perturbations using methods of singular perturbation theory, and we prove that, in the case of two species in competition, the unique positive equilibrium is globally asymptotically stable. Finally, we establish the link between our model, when two species compete for the same nutrient, and the open problem of the chemostat with different removal rates and monotone functional responses. We give some numerical simulations to illustrate the results.
