Similar literature
 20 similar articles found (search time: 0 ms)
1.
We introduce a novel approach for describing patterns of HIV genetic variation using regression modeling techniques. Parameters are defined for describing genetic variation within and between viral populations by generalizing Simpson's index of diversity. Regression models are specified for these variation parameters and the generalized estimating equation framework is used for estimating both the regression parameters and their corresponding variances. Conditions are described under which the usual asymptotic approximations to the distribution of the estimators are met. This approach provides a formal statistical framework for testing hypotheses regarding the changing patterns of HIV genetic variation over time within an infected patient. The application of these methods for testing biologically relevant hypotheses concerning HIV genetic variation is demonstrated in an example using sequence data from a subset of patients from the Multicenter AIDS Cohort Study.
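As a point of reference for the abstract above, the classical Simpson's index of diversity can be sketched as follows. This is only the standard, ungeneralized index (the probability that two items drawn without replacement differ); the abstract does not spell out the paper's generalization, so nothing here should be read as the authors' estimator. The function name `simpson_diversity` and the toy sequences are illustrative assumptions.

```python
from collections import Counter


def simpson_diversity(sequences):
    """Classical Simpson's index of diversity (illustrative helper, not from the paper).

    Returns the probability that two sequences drawn at random,
    without replacement, from the sample are different.
    """
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(sequences)
    # Ordered pairs of identical sequences, divided by all ordered pairs.
    same = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same / (n * (n - 1))


# Toy sample: one variant appears twice, two variants appear once.
sample = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(simpson_diversity(sample))
```

A value of 0 means every sequence in the sample is identical, and a value of 1 means no two sequences match.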

2.
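Influence analysis of covariance matrix perturbation in the multivariate linear regression model   Total citations: 2, self-citations: 0, citations by others: 2
This paper discusses influence analysis for perturbations of the covariance matrix in the multivariate linear regression model and derives several relations between B̂ and B̂(G), where B̂ is the best linear unbiased estimator (BLUE) of the parameter matrix B in the original model and B̂(G) is the BLUE of B after the covariance matrix is perturbed. A measure DG of the size of the influence is given, together with several of its forms. A final example illustrates the effectiveness of DG in influence analysis.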

3.
4.
In the generalized method of moments approach to longitudinal data analysis, unbiased estimating functions can be constructed to incorporate both the marginal mean and the correlation structure of the data. Increasing the number of parameters in the correlation structure corresponds to increasing the number of estimating functions. Thus, building a correlation model is equivalent to selecting estimating functions. This paper proposes a chi-squared test to choose informative unbiased estimating functions. We show that this methodology is useful for identifying which source of correlation it is important to incorporate when there are multiple possible sources of correlation. This method can also be applied to determine the optimal working correlation for the generalized estimating equation approach.

5.
6.
L. Xue  L. Wang  A. Qu 《Biometrics》2010,66(2):393-404
Summary. We propose a new estimation method for multivariate failure time data using the quadratic inference function (QIF) approach. The proposed method efficiently incorporates within‐cluster correlations. Therefore, it is more efficient than those that ignore within‐cluster correlation. Furthermore, the proposed method is easy to implement. Unlike the weighted estimating equations in Cai and Prentice (1995, Biometrika 82, 151–164), it is not necessary to explicitly estimate the correlation parameters. This simplification is particularly useful in analyzing data with large cluster size where it is difficult to estimate intracluster correlation. Under certain regularity conditions, we show the consistency and asymptotic normality of the proposed QIF estimators. A chi‐squared test is also developed for hypothesis testing. We conduct extensive Monte Carlo simulation studies to assess the finite sample performance of the proposed methods. We also illustrate the proposed methods by analyzing primary biliary cirrhosis (PBC) data.

7.
As the molecular marker density grows, there is a strong need in both genome-wide association studies and genomic selection to fit models with a large number of parameters. Here we present a computationally efficient generalized ridge regression (RR) algorithm for situations in which the number of parameters greatly exceeds the number of observations. The computationally demanding parts of the method depend mainly on the number of observations and not the number of parameters. The algorithm was implemented in the R package bigRR based on the previously developed package hglm. Using such an approach, a heteroscedastic effects model (HEM) was also developed, implemented, and tested. The efficiency for different data sizes was evaluated via simulation. The method was tested for a bacteria-hypersensitive trait in a publicly available Arabidopsis data set including 84 inbred lines and 216,130 SNPs. The computation of all the SNP effects required <10 sec using a single 2.7-GHz core. The advantage in run time makes permutation testing feasible for such a whole-genome model, so that a genome-wide significance threshold can be obtained. HEM was found to be more robust than ordinary RR (a.k.a. SNP-best linear unbiased prediction) in terms of QTL mapping, because SNP-specific shrinkage was applied instead of a common shrinkage. The proposed algorithm was also assessed for genomic evaluation and was shown to give better predictions than ordinary RR.
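One standard way in which ridge computations can be made to depend on the number of observations n rather than the number of parameters p is the dual identity (XᵀX + λI)⁻¹Xᵀy = Xᵀ(XXᵀ + λI)⁻¹y. The sketch below illustrates that identity for ordinary ridge regression with a single shrinkage parameter. It is not the bigRR algorithm, whose details the abstract does not give, and the helper names `solve`, `ridge_dual` and `ridge_primal` are assumptions made for illustration. A small hand-written solver is used only to keep the example dependency-free; in practice a linear-algebra library would be the natural choice.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_dual(X, y, lam):
    """Ridge coefficients from an n x n system: beta = X^T (X X^T + lam I)^-1 y."""
    n, p = len(X), len(X[0])
    gram = [[sum(X[i][k] * X[j][k] for k in range(p)) + (lam if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    alpha = solve(gram, y)
    return [sum(X[i][k] * alpha[i] for i in range(n)) for k in range(p)]


def ridge_primal(X, y, lam):
    """Ridge coefficients from the usual p x p system, for comparison only."""
    n, p = len(X), len(X[0])
    xtx = [[sum(X[i][r] * X[i][c] for i in range(n)) + (lam if r == c else 0.0)
            for c in range(p)] for r in range(p)]
    xty = [sum(X[i][r] * y[i] for i in range(n)) for r in range(p)]
    return solve(xtx, xty)


# Toy data: 2 observations, 4 parameters (p > n).
X = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]
y = [1.0, 2.0]
print(ridge_dual(X, y, lam=0.5))
print(ridge_primal(X, y, lam=0.5))
```

Both calls should print the same coefficients up to rounding. The dual route solves a 2 x 2 system here instead of a 4 x 4 one, which is where the saving comes from when p is in the hundreds of thousands.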

8.
Recent genome-wide association studies identified genetic variants that confer susceptibility to type 2 diabetes mellitus (T2DM). However, few longitudinal genome-wide association studies of this metabolic disorder have been reported to date. Therefore, we performed a longitudinal exome-wide association study of T2DM, using 24,579 single nucleotide polymorphisms (SNPs) and repeated measurements from 6022 Japanese individuals. The generalized estimating equation model was applied to test relations of SNPs to three T2DM-related parameters: prevalence of T2DM, fasting plasma glucose level, and blood glycosylated hemoglobin content. Three SNPs that passed quality control were significantly (P < 2.26 × 10⁻⁷) associated with two of the three T2DM-related parameters in additive and recessive models. Of the three SNPs, rs6414624 in EVC and rs78338345 in GGA3 were novel susceptibility loci for T2DM. In the present study, the SNP of GGA3 was predicted to be a genetic variant whose minor allele frequency has recently increased in East Asia.

9.
10.
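A condition number method for selecting independent variables in regression equations and its application to RK surgery   Total citations: 1, self-citations: 1, citations by others: 1
Selecting appropriate independent variables is the primary problem in specifying a linear regression model. With the aim of eliminating multicollinearity among the independent variables, this paper introduces a condition number method for selecting the independent variables of a regression equation and applies it to predicting the outcome of RK surgery.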

11.
We describe an algorithm based upon the Sherman–Morrison–Woodbury formula for the inversion of matrices with special structure that occur in formulae for deletion diagnostics. Substantial computational savings relative to a method based upon Cholesky's decomposition are illustrated. The result has broad application to regression diagnostics for clustered data.
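For reference, the Sherman–Morrison–Woodbury formula named in the abstract above is usually stated as below. The symbols A, U, C and V are generic notation chosen here, not the paper's, and the abstract does not say which special matrix structure the authors exploit.

```latex
% A is n x n and invertible, C is k x k and invertible,
% U is n x k, V is k x n (generic notation, assumed for illustration).
\[
  (A + U C V)^{-1}
  = A^{-1} - A^{-1} U \left( C^{-1} + V A^{-1} U \right)^{-1} V A^{-1}
\]
```

When k is much smaller than n and the inverse of A is already available, the right-hand side needs only a k x k inversion, which is the general source of the computational savings such methods rely on.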

12.
13.
Yuanjia Wang  Huaihou Chen 《Biometrics》2012,68(4):1113-1125
Summary. We examine a generalized F‐test of a nonparametric function through penalized splines and a linear mixed effects model representation. With a mixed effects model representation of penalized splines, we imbed the test of an unspecified function into a test of some fixed effects and a variance component in a linear mixed effects model with nuisance variance components under the null. The procedure can be used to test a nonparametric function or varying‐coefficient with clustered data, compare two spline functions, test the significance of an unspecified function in an additive model with multiple components, and test a row or a column effect in a two‐way analysis of variance model. Through a spectral decomposition of the residual sum of squares, we provide a fast algorithm for computing the null distribution of the test, which significantly improves the computational efficiency over bootstrap. The spectral representation reveals a connection between the likelihood ratio test (LRT) in a multiple variance components model and a single component model. We examine our methods through simulations, where we show that the power of the generalized F‐test may be higher than the LRT, depending on the hypothesis of interest and the true model under the alternative. We apply these methods to compute the genome‐wide critical value and p‐value of a genetic association test in a genome‐wide association study (GWAS), where the usual bootstrap is computationally intensive (up to 10^8 simulations) and asymptotic approximation may be unreliable and conservative.

14.
We propose a mixed-effect linear model, as a particular case of the two-level regression model, for analyzing repeated measures made at completely irregular time points. The model allows for subject-level covariates, so as to study the trend and the variability of the individual growth curves. Application of this model is illustrated on a published data set.

15.
Summary. A two‐stage design is cost‐effective for genome‐wide association studies (GWAS) testing hundreds of thousands of single nucleotide polymorphisms (SNPs). In this design, each SNP is genotyped in stage 1 using a fraction of case–control samples. Top‐ranked SNPs are selected and genotyped in stage 2 using additional samples. A joint analysis, combining statistics from both stages, is applied in the second stage. Follow‐up studies can be regarded as a two‐stage design. Once some potential SNPs are identified, independent samples are further genotyped and analyzed separately or jointly with previous data to confirm the findings. When the underlying genetic model is known, an asymptotically optimal trend test (TT) can be used at each analysis. In practice, however, genetic models for SNPs with true associations are usually unknown. In this case, the existing methods for analysis of the two‐stage design and follow‐up studies are not robust across different genetic models. We propose a simple robust procedure with genetic model selection for the two‐stage GWAS. Our results show that, if the optimal TT has about 80% power when the genetic model is known, then the existing methods for analysis of the two‐stage design have minimum power of about 20% across the four common genetic models (when the true model is unknown), while our robust procedure has minimum power of about 70% across the same genetic models. The results can also be applied to follow‐up and replication studies with a joint analysis.

16.
Model checking for ROC regression analysis   Total citations: 1, self-citations: 0, citations by others: 1
Cai T  Zheng Y 《Biometrics》2007,63(1):152-163
Summary. The receiver operating characteristic (ROC) curve is a prominent tool for characterizing the accuracy of a continuous diagnostic test. To account for factors that might influence the test accuracy, various ROC regression methods have been proposed. However, as in any regression analysis, when the assumed models do not fit the data well, these methods may render invalid and misleading results. To date, practical model-checking techniques suitable for validating existing ROC regression models are not yet available. In this article, we develop cumulative residual-based procedures to graphically and numerically assess the goodness of fit for some commonly used ROC regression models, and show how specific components of these models can be examined within this framework. We derive asymptotic null distributions for the residual processes and discuss resampling procedures to approximate these distributions in practice. We illustrate our methods with a dataset from the cystic fibrosis registry.

17.
Model selection for multiperiod forecasts   Total citations: 1, self-citations: 0, citations by others: 1
LIU  SHU-ING 《Biometrika》1996,83(4):861-873

18.
Bioinformatics and re-sequencing approaches were used for the discovery of sequence polymorphisms in Litopenaeus vannamei. A total of 1221 putative single nucleotide polymorphisms (SNPs) were identified in a pool of individuals from various commercial populations. A set of 211 SNPs were selected for further molecular validation and 88% showed variation in 637 samples representing three commercial breeding lines. An association analysis was performed between these markers and several traits of economic importance for shrimp producers including resistance to three major viral diseases. A small number of SNPs showed associations with test weekly gain, grow-out survival and resistance to Taura Syndrome Virus. Very low levels of linkage disequilibrium were revealed between most SNP pairs, with only 11% of SNPs showing an r²-value above 0.10 with at least one other SNP. Comparison of allele frequencies showed small changes over three generations of the breeding programme in one of the commercial breeding populations. This unique SNP resource has the potential to catalyse future studies of genetic dissection of complex traits, tracing relationships in breeding programmes, and monitoring genetic diversity in commercial and wild populations of L. vannamei.
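The r² statistic quoted above is the usual pairwise measure of linkage disequilibrium. A minimal sketch of its textbook definition follows, assuming the haplotype frequency of the two reference alleles is already known. The abstract does not say how the authors estimated r² from their genotype data, so the function name `ld_r_squared` and the example frequencies are purely illustrative.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Textbook r^2 between two biallelic loci (illustrative helper).

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Hypothetical frequencies: weak association between the two loci.
print(ld_r_squared(p_ab=0.3, p_a=0.5, p_b=0.5))
```

With these made-up frequencies the result falls below the 0.10 threshold mentioned in the abstract, which is the kind of SNP pair the authors describe as showing very low linkage disequilibrium.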

19.
In this study, we are interested in the problem of estimating the parameters in a nonlinear regression model when the error terms are correlated. Throughout this work, we restrict ourselves to the special case when the error terms follow a pth order stationary autoregressive model (AR(p)). Following the idea of LAWTON and SYLVESTRE (1971) and GALLANT and GOEBEL (1976), a parameter-elimination method is proposed, which has the advantages that it is not sensitive to the initial values and convergence of the procedure may be more stable because of the reduced dimension of the problem. The parameter-elimination method is compared with the methods by GALLANT and GOEBEL (1976) and GLASBEY (1980) by Monte Carlo Simulation, and the results of applying the first two methods to the real data obtained from the Environmental Protection Administration of the Executive Yuan of the Republic of China are presented.

20.
Background: In genetic association studies with quantitative trait loci (QTL), the association between a candidate genetic marker and the trait of interest is commonly examined by the omnibus F test or by the t-test corresponding to a given genetic model or mode of inheritance. It is known that the t-test with a correct model specification is more powerful than the F test. However, since the underlying genetic model is rarely known in practice, the use of a model-specific t-test may incur substantial power loss. Robust-efficient tests, such as the Maximin Efficiency Robust Test (MERT) and MAX3 have been proposed in the literature. Methods: In this paper, we propose a novel two-step robust-efficient approach, namely, the genetic model selection (GMS) method for quantitative trait analysis. GMS selects a genetic model by testing Hardy-Weinberg disequilibrium (HWD) with extremal samples of the population in the first step and then applies the corresponding genetic model-specific t-test in the second step. Results: Simulations show that GMS is not only more efficient than MERT and MAX3, but also has comparable power to the optimal t-test when the genetic model is known. Conclusion: Application to the data from Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort demonstrates that the proposed approach can identify meaningful biological SNPs on chromosome 19.
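The first step of GMS relies on Hardy-Weinberg disequilibrium. As a rough illustration of the quantity involved, the sketch below computes the standard disequilibrium coefficient from genotype counts. The abstract does not give the actual test statistic, the definition of extremal samples, or the rule that maps the result to a genetic model, so none of those are reproduced here. The function name `hwd_coefficient` and the counts are hypothetical.

```python
def hwd_coefficient(n_aa, n_ab, n_bb):
    """Hardy-Weinberg disequilibrium coefficient for a biallelic marker (illustrative).

    n_aa, n_ab, n_bb: counts of the AA, AB and BB genotypes.
    Returns P(AA) - p_A^2, which is zero under Hardy-Weinberg equilibrium.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p_a = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    return n_aa / n - p_a * p_a


# Hypothetical counts: the first sample is in equilibrium, the second has excess homozygotes.
print(hwd_coefficient(25, 50, 25))
print(hwd_coefficient(40, 20, 40))
```

A positive coefficient indicates more homozygotes than expected under equilibrium and a negative one indicates fewer. How the sign and size of such a departure are turned into a choice of genetic model is specific to the paper.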
