Similar literature
1.
We welcome Dr Thorpe's interesting discussion (Thorpe, 1988), and we would like to take this opportunity to clarify some points.
Both MGPCA (multiple group principal component analysis) and CPCA (common principal component analysis) serve essentially the same purpose, namely estimation of principal components simultaneously in several groups, based on the assumption of equality of principal component directions across groups, while eigenvalues may differ between groups. However, CPCA has the distinct advantage that this assumption can actually be tested, using the (CPC) statistic. In analyses involving more than two variables, it is usually difficult to decide, without a formal test, whether or not the assumption of common directions of principal components is reasonable.
There is also a conceptual difficulty with MGPCA. In statistical terms, both methods assume that:
(a) a certain set of parameters (namely those determining the eigenvectors) are common to all groups; and
(b) there are sets of parameters (namely p eigenvalues per group) which are specific to each group.
CPCA sets up a model that reflects this structure and estimates the parameters accordingly. MGPCA, on the other hand, ignores part (b), at least temporarily, by pooling the variance-covariance matrices and extracting eigenvectors from the single pooled matrix. This may lead to reasonable results, but there is no guarantee that it will indeed do so. The reader may find a more familiar analog in the fitting of regression lines when data are in groups. If it is assumed that all regression lines are parallel, one should set up an appropriate model based on a single slope parameter common to all groups, and groupwise intercepts. One should then estimate the parameters of this model, and not simply apply a technique which is appropriate in the one-group case only.
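The parallel-lines analogy above can be made concrete with a minimal sketch. It assumes ordinary least squares and one predictor; the function name `fit_parallel_lines` and the toy data are hypothetical and are not taken from the discussion itself. The common slope is estimated from the pooled within-group sums of squares and products, and each group then receives its own intercept.

```python
from statistics import fmean


def fit_parallel_lines(groups):
    """Least-squares fit of y = a_g + b * x with one slope b shared by all groups.

    groups: dict mapping a group name to a list of (x, y) pairs.
    Returns (slope, intercepts), where intercepts maps group name to a_g.
    """
    sxy = 0.0  # pooled within-group sum of cross-products
    sxx = 0.0  # pooled within-group sum of squares of x
    means = {}
    for name, points in groups.items():
        xbar = fmean([x for x, _ in points])
        ybar = fmean([y for _, y in points])
        means[name] = (xbar, ybar)
        sxy += sum((x - xbar) * (y - ybar) for x, y in points)
        sxx += sum((x - xbar) ** 2 for x, _ in points)
    slope = sxy / sxx
    # Each group's line passes through that group's own centroid.
    intercepts = {name: ybar - slope * xbar for name, (xbar, ybar) in means.items()}
    return slope, intercepts


# Hypothetical toy data: two groups lying on parallel lines with slope 2.
groups = {
    "A": [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    "B": [(0.0, 4.0), (1.0, 6.0), (2.0, 8.0)],
}
slope, intercepts = fit_parallel_lines(groups)
```

Fitting a single line to the stacked data would instead mix between-group and within-group variation, which is the kind of one-group shortcut the passage warns against.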

2.
In this paper, we propose a simple parametric modal linear regression model in which the response variable is gamma distributed, using a new parameterization of this distribution that is indexed by mode and precision parameters. In this new regression model, the modal and precision responses are related to a linear predictor through a link function, and the linear predictor involves covariates and unknown regression parameters. The main advantage of our new parameterization is the straightforward interpretation of the regression coefficients in terms of the mode of the positive response variable, as is usual in the context of generalized linear models, and direct inference in parametric mode regression based on the likelihood paradigm. Furthermore, we discuss residuals and influence diagnostic tools. A Monte Carlo experiment is conducted to evaluate the performance of the estimators in finite samples, with a discussion of the results. Finally, we illustrate the usefulness of the new model by two applications, to biology and demography.
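As a rough illustration of what indexing a gamma distribution by its mode involves, the following uses the standard shape-rate form. The symbols $\alpha$, $\lambda$, $M$, and the log link are assumptions made here for illustration; the abstract does not state the paper's exact parameterization or its precision parameter.

```latex
% Standard gamma density with shape \alpha and rate \lambda (textbook form)
f(y;\alpha,\lambda) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)}\, y^{\alpha-1} e^{-\lambda y},
\qquad y > 0 .

% For \alpha > 1 the density has a unique interior mode
M = \frac{\alpha - 1}{\lambda}
\quad\Longleftrightarrow\quad
\lambda = \frac{\alpha - 1}{M} .

% A mode regression then links M_i to covariates x_i; a log link is one
% possible (assumed) choice that keeps M_i positive
\log M_i = x_i^{\top}\beta .
```

Under such a link, each coefficient in $\beta$ describes a multiplicative change in the most likely response value rather than in its mean.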

3.
Maity A, Lin X. Biometrics 2011, 67(4): 1271-1284
We propose in this article a powerful testing procedure for detecting a gene effect on a continuous outcome in the presence of possible gene-gene interactions (epistasis) in a gene set, e.g., a genetic pathway or network. Traditional tests for this purpose require a large number of degrees of freedom by testing the main effect and all the corresponding interactions under a parametric assumption, and hence suffer from low power. In this article, we propose a powerful kernel machine based test. Specifically, our test is based on a garrote kernel method and is constructed as a score test. Here, the term garrote refers to an extra nonnegative parameter that is multiplied by the covariate of interest so that our score test can be formulated in terms of this nonnegative parameter. A key feature of the proposed test is that it is flexible and developed for both parametric and nonparametric models within a unified framework, and is more powerful than the standard test because it accounts for the correlation among genes and hence often uses far fewer degrees of freedom. We investigate the theoretical properties of the proposed test. We evaluate its finite sample performance using simulation studies, and apply the method to the Michigan prostate cancer gene expression data.

4.
Population multiple components is a statistical tool useful for the analysis of time-dependent hybrid data. With a small number of parameters, it is possible to model and to predict the periodic behavior of a population. In this article, we propose two methods to compare rhythmometric parameters obtained by multiple component analysis among populations. The first is a parametric method based on the usual statistical techniques for comparison of mean vectors in multivariate normal populations. The method, through MANOVA analysis, allows comparison of the MESOR and amplitude-acrophase pair of each component among two or more populations. The second is a nonparametric method, based on bootstrap techniques, to compare parameters from two populations. This test allows one to compare the MESOR, the amplitude, and the acrophase of each fitted component, as well as the global amplitude, orthophase, and bathyphase estimated when all fitted components are harmonics of a fundamental period. The idea is to calculate a confidence interval for the difference of the parameters of interest. If this interval does not contain zero, it can be concluded that the parameters from the two models are different with high probability. An estimate of the p-value for the corresponding test can also be calculated. Both methods are illustrated with an example, based on clinical data. The nonparametric test can also be applied to paired data, a special situation of great interest in practice. By the use of similar bootstrap techniques, we illustrate how to construct confidence intervals for any rhythmometric parameter estimated from population multiple components models, including the orthophase, bathyphase, and global amplitude. These tests for comparison of parameters among populations are a needed tool when modeling the nonsinusoidal rhythmic behavior of hybrid data by population multiple component analysis.
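The confidence-interval idea described above can be sketched with a plain percentile bootstrap. This is a simplified stand-in, not the authors' procedure: it assumes each population is summarized by a list of per-individual estimates of one parameter, uses the mean as the statistic, and the function name `bootstrap_diff_ci` and the data are hypothetical.

```python
import random
from statistics import fmean


def bootstrap_diff_ci(sample_a, sample_b, statistic=fmean,
                      n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for statistic(a) - statistic(b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_boot):
        # Resample each population independently, with replacement.
        resample_a = rng.choices(sample_a, k=len(sample_a))
        resample_b = rng.choices(sample_b, k=len(sample_b))
        diffs.append(statistic(resample_a) - statistic(resample_b))
    diffs.sort()
    tail = (1.0 - level) / 2.0
    lower = diffs[int(tail * (n_boot - 1))]
    upper = diffs[int((1.0 - tail) * (n_boot - 1))]
    return lower, upper


# Hypothetical per-individual amplitude estimates for two populations.
amplitudes_a = [10.2, 11.1, 9.8, 10.5, 10.9, 11.4, 10.0, 10.7]
amplitudes_b = [14.9, 15.6, 16.2, 15.1, 15.8, 16.0, 15.3, 15.5]
lower, upper = bootstrap_diff_ci(amplitudes_a, amplitudes_b)
excludes_zero = not (lower <= 0.0 <= upper)
```

When `excludes_zero` is true, the interval does not contain zero, which is the decision rule the abstract describes. For paired data one would resample the pairs jointly rather than the two samples independently.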

6.
In studies that involve multivariate outcomes it is often of interest to test for a common exposure effect. For example, our research is motivated by a study of neurocognitive performance in a cohort of HIV-infected women. The goal is to determine whether highly active antiretroviral therapy affects different aspects of neurocognitive functioning to the same degree and if so, to test for the treatment effect using a more powerful one-degree-of-freedom global test. Since multivariate continuous outcomes are likely to be measured on different scales, such a common exposure effect has not been well defined. We propose the use of a scaled marginal model for testing and estimating this global effect when the outcomes are all continuous. A key feature of the model is that the effect of exposure is represented by a common effect size and hence has a well-understood, practical interpretation. Estimating equations are proposed to estimate the regression coefficients and the outcome-specific scale parameters, where the correct specification of the within-subject correlation is not required. These estimating equations can be solved by repeatedly calling standard generalized estimating equations software such as SAS PROC GENMOD. To test whether the assumption of a common exposure effect is reasonable, we propose the use of an estimating-equation-based score-type test. We study the asymptotic efficiency loss of the proposed estimators, and show that they generally have high efficiency compared to the maximum likelihood estimators. The proposed method is applied to the HIV data.

7.
In studies of human balance, it is common to fit stimulus-response data by tuning the time-delay and gain parameters of a simple delayed feedback model. Many interpret this fitted model as evidence that predictive processes are not required to explain existing data on standing balance. However, two questions lead us to doubt this approach. First, does fitting a delayed feedback model lead to reliable estimates of the time-delay? Second, can a non-predictive controller provide an explanation compatible with the independently estimated time delay? For methodological and experimental clarity, we study human balancing of a simulated inverted pendulum via joystick and screen. A two-step approach to data analysis is used: first, a non-parametric model (the closed-loop impulse response) is estimated from the experimental data; second, a parametric model is fitted to the non-parametric impulse response by adjusting time-delay and controller parameters. To support the second step, a new explicit formula relating controller parameters to closed-loop impulse response is derived. Two classes of controller are investigated within a common state-space context: non-predictive and predictive. It is found that the time-delay estimate arising from the second step is strongly dependent on which controller class is assumed; in particular, the non-predictive control assumption leads to time-delay estimates that are smaller than those arising from the predictive assumption. Moreover, the time-delays estimated using the non-predictive control assumption are not consistent with a lower bound on the time-delay of the non-parametric model, whereas the corresponding predictive result is consistent. Thus, while the goodness of fit only marginally favoured predictive over non-predictive control, if we add the constraint that the model must reproduce the non-parametric time delay, then the non-predictive control model fails.
We conclude that (1) the time-delay should be estimated independently of fitting a low-order parametric model, (2) balance of the simulated inverted pendulum could not be explained by the non-predictive control model, and (3) predictive control provided a better explanation than non-predictive control.

8.
When the mode of inheritance of a disease is unknown, the LOD-score method of linkage analysis must take into account uncertainties in model parameters. We have previously proposed a parametric linkage test called "MFLOD," which does not require specification of disease model parameters. In the present study, we introduce two new model-free parametric linkage tests, known as "MLOD" and "MALOD." These tests are defined, respectively, as the LOD score and the admixture LOD score, maximized (subject to the same constraints as MFLOD) over disease-model parameters. We compared the power of these three parametric linkage tests and that of two nonparametric linkage tests, NPLall and NPLpairs, which are implemented in GENEHUNTER. With the use of small pedigrees and a fully informative marker, we found the powers of MLOD, NPLall, and NPLpairs to be almost equivalent to each other and not far below that of a LOD-score analysis performed under the assumption of the correct genetic parameters. Thus, linkage analysis is not much hindered by uncertain mode of inheritance. The results also suggest that both parametric and nonparametric methods are suitable for linkage analysis of complex disorders in small pedigrees. However, whether these results apply to large pedigrees remains to be answered.

9.
Peng Y, Dear KB. Biometrics 2000, 56(1): 237-243
Nonparametric methods have attracted less attention than their parametric counterparts for cure rate analysis. In this paper, we study a general nonparametric mixture model. The proportional hazards assumption is employed in modeling the effect of covariates on the failure time of patients who are not cured. The EM algorithm, the marginal likelihood approach, and multiple imputations are employed to estimate parameters of interest in the model. This model extends models and improves estimation methods proposed by other researchers. It also extends Cox's proportional hazards regression model by allowing a proportion of event-free patients and investigating covariate effects on that proportion. The model and its estimation method are investigated by simulations. An application to breast cancer data, including comparisons with previous analyses using a parametric model and an existing nonparametric model by other researchers, confirms the conclusions from the parametric model but not those from the existing nonparametric model.
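The structure described above (a cured proportion plus a proportional hazards model for the uncured) is commonly written as a mixture survival function. The notation below is an assumed, generic form for illustration; in particular, the logistic link for the cured proportion is a common choice that the abstract does not confirm.

```latex
% Population survival: cured fraction \pi(z) never fails; the rest follow S_u
S_{\mathrm{pop}}(t \mid x, z) = \pi(z) + \bigl(1 - \pi(z)\bigr)\, S_u(t \mid x)

% Proportional hazards for uncured patients, with baseline survival S_0
% left unspecified (nonparametric)
S_u(t \mid x) = S_0(t)^{\exp(\beta^{\top} x)}

% Covariate effects on the cured proportion (logistic link assumed here)
\log \frac{\pi(z)}{1 - \pi(z)} = \gamma^{\top} z
```

Setting $\pi(z) = 0$ recovers the ordinary Cox proportional hazards model, which is the sense in which the mixture model extends it.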

10.
In clinical trials examining the incidence of pneumonia it is a common practice to measure infection via both invasive and non-invasive procedures. In the context of a recently completed randomized trial comparing two treatments, the invasive procedure was only utilized in certain scenarios due to the added risk involved, and given that the level of the non-invasive procedure surpassed a given threshold. Hence, what was observed was bivariate data with a pattern of missingness in the invasive variable dependent upon the value of the observed non-invasive observation within a given pair. In order to compare two treatments with bivariate observed data exhibiting this pattern of missingness, we developed a semi-parametric methodology utilizing the density-based empirical likelihood approach in order to provide a non-parametric approximation to Neyman-Pearson-type test statistics. This novel empirical likelihood approach has both parametric and non-parametric components. The non-parametric component utilizes the observations for the non-missing cases, while the parametric component is utilized to tackle the case where observations are missing with respect to the invasive variable. The method is illustrated through its application to the actual data obtained in the pneumonia study and is shown to be an efficient and practical method.

11.
The confirmatory analysis of pre-specified multiple hypotheses has become common in pivotal clinical trials. In the recent past multiple test procedures have been developed that reflect the relative importance of different study objectives, such as fixed sequence, fallback, and gatekeeping procedures. In addition, graphical approaches have been proposed that facilitate the visualization and communication of Bonferroni-based closed test procedures for common multiple test problems, such as comparing several treatments with a control, assessing the benefit of a new drug for more than one endpoint, combined non-inferiority and superiority testing, or testing a treatment at different dose levels in an overall and a subpopulation. In this paper, we focus on extended graphical approaches by dissociating the underlying weighting strategy from the employed test procedure. This allows one to first derive suitable weighting strategies that reflect the given study objectives and subsequently apply appropriate test procedures, such as weighted Bonferroni tests, weighted parametric tests accounting for the correlation between the test statistics, or weighted Simes tests. We illustrate the extended graphical approaches with several examples. In addition, we describe briefly the gMCP package in R, which implements some of the methods described in this paper.
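A Bonferroni-based graphical procedure of the kind mentioned above can be sketched in a few lines. This is an illustrative implementation of the usual sequentially rejective update rule with weighted Bonferroni tests; it does not use or reproduce the gMCP package, and the function name `graphical_bonferroni` and the example inputs are hypothetical.

```python
def graphical_bonferroni(p_values, weights, transitions, alpha=0.05):
    """Sequentially rejective graphical procedure with weighted Bonferroni tests.

    weights[i] is the initial share of alpha assigned to hypothesis i.
    transitions[i][j] is the fraction of i's weight passed to j once i is rejected.
    Returns the set of indices of rejected hypotheses.
    """
    w = list(weights)
    g = [list(row) for row in transitions]
    active = set(range(len(p_values)))
    rejected = set()
    while True:
        # Weighted Bonferroni test on each remaining hypothesis.
        candidates = [j for j in sorted(active) if p_values[j] <= w[j] * alpha]
        if not candidates:
            return rejected
        j = candidates[0]
        active.remove(j)
        rejected.add(j)
        new_w = list(w)
        new_g = [row[:] for row in g]
        for l in active:
            # Pass the rejected hypothesis's weight along its outgoing edges.
            new_w[l] = w[l] + w[j] * g[j][l]
            for k in active:
                denom = 1.0 - g[l][j] * g[j][l]
                if l != k and denom > 0:
                    # Reroute edges that previously went through j.
                    new_g[l][k] = (g[l][k] + g[l][j] * g[j][k]) / denom
                else:
                    new_g[l][k] = 0.0
        w, g = new_w, new_g


# Two hypotheses with equal weights that pass all weight to each other
# (this graph corresponds to the Holm procedure).
holm_graph = [[0.0, 1.0], [1.0, 0.0]]
both = graphical_bonferroni([0.01, 0.04], [0.5, 0.5], holm_graph)
none = graphical_bonferroni([0.03, 0.04], [0.5, 0.5], holm_graph)
```

In the first call, the first hypothesis is rejected at level 0.025, its weight moves to the second, and the second is then tested at the full level 0.05. Separating `weights` and `transitions` from the test inside the loop mirrors the abstract's point that the weighting strategy can be chosen independently of the test procedure.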

12.
Since the seminal work of Prentice and Pyke, the prospective logistic likelihood has become the standard method of analysis for retrospectively collected case-control data, in particular for testing the association between a single genetic marker and a disease outcome in genetic case-control studies. In the study of multiple genetic markers with relatively small effects, especially those with rare variants, various aggregated approaches based on the same prospective likelihood have been developed to integrate subtle association evidence among all the markers considered. Many of the commonly used tests are derived from the prospective likelihood under a common-random-effect assumption, which assumes a common random effect for all subjects. We develop the locally most powerful aggregation test based on the retrospective likelihood under an independent-random-effect assumption, which allows the genetic effect to vary among subjects. In contrast to the fact that disease prevalence information cannot be used to improve efficiency for the estimation of odds ratio parameters in logistic regression models, we show that it can be utilized to enhance the testing power in genetic association studies. Extensive simulations demonstrate the advantages of the proposed method over the existing ones. A real genome-wide association study is analyzed for illustration.

13.
The analysis of multiple components is often used to model biological variables that show nonsinusoidal predictable changes of known periods. In general, to anticipate the periods is not easy, and even in cases when we have some a priori information, it is advisable to have a statistical tool to test the chosen periods. In this work, we introduce a statistical procedure to estimate periods of longitudinal series by applying nonlinear regression techniques to the multiple sinusoidal model, as well as to the general linear model. Approximate inferences about the parameters of the model are carried out under the usual hypothesis of normality, independence, and constant variance of the errors. Confidence intervals (CIs) for each individual parameter, as well as for the amplitude-acrophase pair or for any other subgroup of parameters of interest, can be computed. As in the linear analysis of multiple components, it is possible to check the existence of rhythm by means of a zero-amplitude test. The method also allows statistical testing of several hypotheses related to the periods. For example, it is possible to test if the periods are equal to certain values of chronobiologic interest and to check if some components included in the model are harmonically related. On the other hand, when the fitted components have proximal periods, the method allows one to verify if they are modeling the same or different spectral peaks. The method, which was validated by a simulation study for a model of two components and is illustrated by an example of modeling the diastolic blood pressure of two subjects, represents a new step in the development of statistical procedures in chronobiology. (Chronobiology International, 18(2), 285-308, 2001)
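For orientation, the multiple sinusoidal model referred to above is usually written in the following generic cosinor form. The symbols ($M$ for the MESOR, $A_j$, $\phi_j$, $\tau_j$ for the amplitude, acrophase, and period of component $j$) are assumed notation, not quoted from the paper.

```latex
% Multiple-component model with J sinusoidal components
y(t_i) = M + \sum_{j=1}^{J} A_j \cos\!\left(\frac{2\pi t_i}{\tau_j} + \phi_j\right) + e_i,
\qquad e_i \sim N(0, \sigma^2) \ \text{independent}

% With the periods \tau_j known, the model is linear in (M, \beta_j, \gamma_j):
A_j \cos\!\left(\frac{2\pi t}{\tau_j} + \phi_j\right)
  = \beta_j \cos\frac{2\pi t}{\tau_j} + \gamma_j \sin\frac{2\pi t}{\tau_j},
\qquad \beta_j = A_j \cos\phi_j, \quad \gamma_j = -A_j \sin\phi_j

% Zero-amplitude test for component j
H_0 : \beta_j = \gamma_j = 0 \quad (\text{equivalently } A_j = 0)
```

When the periods $\tau_j$ are themselves unknown, they enter inside the cosine, so the model is no longer linear in its parameters; this is why the abstract turns to nonlinear regression to estimate and test them.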

14.
We consider the statistical testing for non-inferiority of a new treatment compared with the standard one under a matched-pair setting in a stratified study or in several trials. A non-inferiority test based on the efficient scores and a Mantel-Haenszel (M-H) like procedure with restricted maximum likelihood estimators (RMLEs) of nuisance parameters and their corresponding sample size formulae are presented. We evaluate the above tests and the M-H type Wald test in level and power. The stratified score test is conservative and provides the best power. The M-H like procedure with RMLEs gives an accurate level. However, the Wald test is anti-conservative and we suggest caution when it is used. The unstratified score test is not biased but it is less powerful than the stratified score test when baseline probabilities related to strata are not the same. This investigation shows that the stratified score test possesses optimum statistical properties in testing non-inferiority. A common difference between two proportions across strata is the basic assumption of the stratified tests; we present appropriate tests to validate the assumption and related remarks.

15.
In genetic research of chronic diseases, age-at-onset outcomes within families are often correlated. The nature of correlation of age-at-onset outcomes is indicative of common genetic and/or shared environmental risk factors among family members. Understanding patterns of such correlation may shed light on the disease etiology and, hence, is an important step to take prior to further searching for the responsible genes via segregation and linkage studies. Age-at-onset outcomes are different from those familiar quantitative or qualitative traits for which many statistical methods have been developed. In comparison with the quantitative traits, age-at-onset outcomes are often censored, i.e., instead of actual age-at-onset outcomes, only the current ages or ages at death are observed. They are also different from qualitative traits because of their continuity. Because of the complexity of correlated censored outcomes, few methods have yet been developed. A traditional approach is to impose a parametric joint distribution for the correlated age-at-onset outcomes, which has been criticized for requiring a stringent assumption about the entire distribution of age at onset. The purpose of this paper is to describe a method for assessing familial aggregation of correlated age-at-onset outcomes semiparametrically, by use of estimating equations. This method does not require any parametric assumption for modeling the age at onset. The estimates of parameters, including those quantifying the correlation within families, are consistent and have an asymptotic normal distribution that can be used to make inferences. To illustrate this new method, we analyzed two age-at-onset data sets that were obtained from studies conducted in the States of Washington and Hawaii, with the objective of quantifying the familial aggregations of age at onset of breast cancer.

16.
Multiple components linear least-squares methods have been proposed for the detection of periodic components in nonsinusoidal longitudinal time series. However, a proper test for comparison of parameters obtained from this method for two or more time series is not yet available. Accordingly, we propose two methods, one parametric and one nonparametric, to compare parameters from rhythmometric models with multiple components. The parametric method is based on techniques commonly and generally employed in linear regression analysis. The comparison of parameters among two or more time series is accomplished by the use of so-called dummy variables. The nonparametric method is based on bootstrap techniques. This approach basically tests if the difference in any given parameter obtained by fitting a model with the same periods to two different longitudinal time series differs from zero. This method calculates a confidence interval for the difference in the tested parameter. If this interval does not contain zero, it can be concluded that the parameters obtained from the two time series are different with high probability. An estimation of the p-value for the corresponding test can also be calculated. By the use of similar bootstrap techniques, confidence intervals can also be obtained for any parameter derived from the multiple component fit of several periods to nonsinusoidal longitudinal time series, including the orthophase (peak time), bathyphase (trough time), and global amplitude (difference between the maximum and the minimum) of the fitted model waveform. These methods represent a valuable tool for the comparison of rhythm parameters obtained by multiple component analysis, and they render this approach as a generally applicable one for waveform representation and detection of periodicities in nonsinusoidal, sparse, and noisy longitudinal time series sampled with either equidistant or unequidistant observations.

17.
Modeling Interference in Genetic Recombination
McPeek MS, Speed TP. Genetics 1995, 139(2): 1031-1044
In analyzing genetic linkage data it is common to assume that the locations of crossovers along a chromosome follow a Poisson process, whereas it has long been known that this assumption does not fit the data. In many organisms it appears that the presence of a crossover inhibits the formation of another nearby, a phenomenon known as "interference." We discuss several point process models for recombination that incorporate position interference but assume no chromatid interference. Using stochastic simulation, we are able to fit the models to a multilocus Drosophila dataset by the method of maximum likelihood. We find that some biologically inspired point process models incorporating one or two additional parameters provide a dramatically better fit to the data than the usual "no-interference" Poisson model.

18.
Methods in the literature for missing covariate data in survival models have relied on the missing at random (MAR) assumption to render regression parameters identifiable. MAR means that missingness can depend on the observed exit time, and whether or not that exit is a failure or a censoring event. By considering ways in which missingness of covariate X could depend on the true but possibly censored failure time T and the true censoring time C, we attempt to identify missingness mechanisms which would yield MAR data. We find that, under various reasonable assumptions about how missingness might depend on T and/or C, additional strong assumptions are needed to obtain MAR. We conclude that MAR is difficult to justify in practical applications. One exception arises when missingness is independent of T, and C is independent of the value of the missing X. As alternatives to MAR, we propose two new missingness assumptions. In one, the missingness depends on T but not on C; in the other, the situation is reversed. For each, we show that the failure time model is identifiable. When missingness is independent of T, we show that the naive complete record analysis will yield a consistent estimator of the failure time distribution. When missingness is independent of C, we develop a complete record likelihood function and a corresponding estimator for parametric failure time models. We propose analyses to evaluate the plausibility of either assumption in a particular data set, and illustrate the ideas using data from the literature on this problem.

19.
Background: Cure models can provide improved possibilities for inference if used appropriately, but there is potential for misleading results if care is not taken. In this study, we compared five commonly used approaches for modelling cure in a relative survival framework and provide some practical advice on the use of these approaches. Patients and methods: Data for colon, female breast, and ovarian cancers were used to illustrate these approaches. The proportion cured was estimated for each of these three cancers within each of three age groups. We then graphically assessed the assumption of cure and the model fit, by comparing the predicted relative survival from the cure models to empirical life table estimates. Results: Where both cure and distributional assumptions are appropriate (e.g., for colon or ovarian cancer patients aged <75 years), all five approaches led to similar estimates of the proportion cured. The estimates varied slightly when cure was a reasonable assumption but the distributional assumption was not (e.g., for colon cancer patients ≥75 years). Greater variability in the estimates was observed when the cure assumption was not supported by the data (breast cancer). Conclusions: If the data suggest cure is not a reasonable assumption then we advise against fitting cure models. In the scenarios where cure was reasonable, we found that flexible parametric cure models performed at least as well as, or better than, the other modelling approaches. We recommend that, regardless of the model used, the underlying assumptions for cure and model fit should always be graphically assessed.
