Similar literature (20 results)
1.
Malka Gorfine  Li Hsu 《Biometrics》2011,67(2):415-426
Summary: In this work, we provide a new class of frailty‐based competing risks models for clustered failure times data. This class is based on expanding the competing risks model of Prentice et al. (1978, Biometrics 34, 541–554) to incorporate frailty variates, with the use of cause‐specific proportional hazards frailty models for all the causes. Parametric and nonparametric maximum likelihood estimators are proposed. The main advantages of the proposed class of models, in contrast to the existing models, are: (1) the inclusion of covariates; (2) the flexible structure of the dependency among the various types of failure times within a cluster; and (3) the unspecified within‐subject dependency structure. The proposed estimation procedures produce the most efficient parametric and semiparametric estimators and are easy to implement. Simulation studies show that the proposed methods perform very well in practical situations.

2.
Summary: Combining data collected from different sources can potentially enhance statistical efficiency in estimating effects of environmental or genetic factors or gene–environment interactions. However, combining data across studies becomes complicated when data are collected under different study designs, such as family‐based and unrelated individual‐based case–control designs. In this article, we describe likelihood‐based approaches that permit the joint estimation of covariate effects on disease risk under study designs that include cases, relatives of cases, and unrelated individuals. Our methods accommodate familial residual correlation and a variety of ascertainment schemes. Extensive simulation experiments demonstrate that the proposed methods for estimation and inference perform well in realistic settings. Efficiencies of different designs are contrasted in the simulation. We applied the methods to data from the Colorectal Cancer Family Registry.

3.
The Cochran–Armitage (CA) linear trend test for proportions is often used for genotype‐based analysis of candidate gene association. Depending on the underlying genetic mode of inheritance, the use of model‐specific scores maximises the power. Commonly, the underlying genetic model, i.e. additive, dominant or recessive mode of inheritance, is a priori unknown. Association studies are commonly analysed using permutation tests, where both inference and identification of the underlying mode of inheritance are important. Especially interesting are tests for case–control studies, defined by a maximum over a series of standardised CA tests, because such a procedure has power under all three genetic models. We reformulate the test problem and propose a conditional maximum test of scores‐specific linear‐by‐linear association tests. For maximum‐type, sum and quadratic test statistics the asymptotic expectation and covariance can be derived in a closed form and the limiting distribution is known. Both the limiting distribution and approximations of the exact conditional distribution can easily be computed using standard software packages. In addition to these technical advances, we extend the area of application to stratified designs, studies involving more than two groups and the simultaneous analysis of multiple loci by means of multiplicity‐adjusted p‐values for the underlying multiple CA trend tests. The new test is applied to reanalyse a study investigating genetic components of different subtypes of psoriasis. A new and flexible inference tool for association studies is available both theoretically and practically, since already available software packages can easily be used to implement the suggested test procedures.

4.
When the independent variables (factors) are quantitative, the so-called polynomial models are available in addition to the coding schemes generally used for the multivariate analysis of variance (dummy-coded or effect-coded design matrices). The advantage of these polynomial models is that their design matrices have full rank, which allows a more comprehensible analysis, i.e. the unambiguous interpretation of tested hypotheses and simultaneous confidence intervals.

5.
Summary: A two‐stage design is cost‐effective for genome‐wide association studies (GWAS) testing hundreds of thousands of single nucleotide polymorphisms (SNPs). In this design, each SNP is genotyped in stage 1 using a fraction of case–control samples. Top‐ranked SNPs are selected and genotyped in stage 2 using additional samples. A joint analysis, combining statistics from both stages, is applied in the second stage. Follow‐up studies can be regarded as a two‐stage design. Once some potential SNPs are identified, independent samples are further genotyped and analyzed separately or jointly with previous data to confirm the findings. When the underlying genetic model is known, an asymptotically optimal trend test (TT) can be used at each analysis. In practice, however, genetic models for SNPs with true associations are usually unknown. In this case, the existing methods for analysis of the two‐stage design and follow‐up studies are not robust across different genetic models. We propose a simple robust procedure with genetic model selection for two‐stage GWAS. Our results show that, if the optimal TT has about 80% power when the genetic model is known, then the existing methods for analysis of the two‐stage design have minimum powers of about 20% across the four common genetic models (when the true model is unknown), while our robust procedure has minimum powers of about 70% across the same genetic models. The results can also be applied to follow‐up and replication studies with a joint analysis.

6.
In population‐based case‐control studies, it is of great public‐health importance to estimate the disease incidence rates associated with different levels of risk factors. This estimation is complicated by the fact that in such studies the selection probabilities for the cases and controls are unequal. A further complication arises when the subjects who are selected into the study do not participate (i.e. become nonrespondents) and nonrespondents differ systematically from respondents. In this paper, we show how to account for unequal selection probabilities as well as differential nonresponses in the incidence estimation. We use two logistic models, one relating the disease incidence rate to the risk factors, and one modelling the predictors that affect the nonresponse probability. After estimating the regression parameters in the nonresponse model, we estimate the regression parameters in the disease incidence model by a weighted estimating function that weights a respondent's contribution to the likelihood score function by the inverse of the product of his/her selection probability and his/her model‐predicted response probability. The resulting estimators of the regression parameters and the corresponding estimators of the incidence rates are shown to be consistent and asymptotically normal with easily estimated variances. Simulation results demonstrate that the asymptotic approximations are adequate for practical use and that failure to adjust for nonresponses could result in severe biases. An illustration with data from a cardiovascular study that motivated this work is presented.
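
The weighting described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `ipw_weights` and `weighted_logistic_score` are hypothetical, the response probabilities are assumed to have been predicted already by a separately fitted nonresponse model, and variance estimation is left out.

```python
import math


def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-t))


def ipw_weights(selection_probs, response_probs):
    """Weight for each respondent: 1 / (selection prob * predicted response prob).

    Both inputs are assumed to be known or already estimated (hypothetical inputs).
    """
    return [1.0 / (s * r) for s, r in zip(selection_probs, response_probs)]


def weighted_logistic_score(beta, X, y, weights):
    """Weighted logistic score: sum_i w_i * x_i * (y_i - expit(x_i . beta)).

    Setting this vector to zero defines the weighted estimator of beta.
    """
    score = [0.0] * len(beta)
    for x_i, y_i, w_i in zip(X, y, weights):
        mu = expit(sum(b * x for b, x in zip(beta, x_i)))
        for j, x_ij in enumerate(x_i):
            score[j] += w_i * x_ij * (y_i - mu)
    return score
```

In an intercept-only model the score is zero at the logit of the weighted outcome mean, which gives a quick sanity check on the weights.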

7.
Modern high‐throughput proteomic platforms allow incomparable protein mixture resolution and identification. However, such sophisticated facilities are expensive and not always accessible for routine analysis of simple mixtures. In this paper, we propose a simple methodology, based on detection of intact, nondigested proteins by LC coupled to single quadrupole MS (sqLC‐MS), followed by the analysis of the resulting spectra by multivariate analysis (MA). By doing so, even large molecular weight (MW) proteins, generating complex spectra, can be characterized to a level that allows isoform discrimination, while standard algorithms, such as MS spectrum deconvolution, cannot. To demonstrate the effectiveness of the proposed approach, we have analyzed the spectra of a set of purified, intact albumins from seven different organisms (bovine, human, rabbit, rat, sheep, mouse, and pig) as a model of microheterogeneous proteins, using Projection to Latent Structure Discriminant Analysis (PLS‐DA). Although these proteins are very similar (less than 1% difference in MW), sqLC‐MS/MA allowed their classification, and the identification of unknown source samples. In addition, MA allowed precise protein quantification from the same data (calibration curve R² = 0.9966). The ability to rapidly characterize and quantify proteins, together with simplicity and affordability, could make combined sqLC‐MS/MA a routine method for the characterization of simple mixtures of known proteins.

8.
Summary: Meta‐analysis is a powerful approach to combine evidence from multiple studies to make inference about one or more parameters of interest, such as regression coefficients. The validity of the fixed effect model meta‐analysis depends on the underlying assumption that all studies in the meta‐analysis share the same effect size. In the presence of heterogeneity, the fixed effect model incorrectly ignores the between‐study variance and may yield false positive results. The random effect model takes into account both within‐study and between‐study variances. It is more conservative than the fixed effect model and should be favored in the presence of heterogeneity. In this paper, we develop a noniterative method of moments estimator for the between‐study covariance matrix in the random effect model multivariate meta‐analysis. To our knowledge, it is the first such method of moments estimator in the matrix form. We show that our estimator is a multivariate extension of DerSimonian and Laird’s univariate method of moments estimator, and it is invariant to linear transformations. In the simulation study, our method performs well when compared to existing random effect model multivariate meta‐analysis approaches. We also apply our method in the analysis of a real data example.
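
For orientation, the univariate DerSimonian and Laird method of moments estimator that this abstract extends can be sketched in Python as follows. This is the standard univariate formula only, not the multivariate estimator proposed in the paper, and the function name `dersimonian_laird` is an illustrative choice.

```python
def dersimonian_laird(effects, variances):
    """Univariate DerSimonian-Laird random-effects meta-analysis.

    effects:   per-study effect estimates
    variances: per-study within-study variances
    Returns (tau2, pooled_effect, pooled_variance).
    """
    k = len(effects)
    if k < 2:
        raise ValueError("at least two studies are needed")
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q heterogeneity statistic.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    # Method of moments estimate of the between-study variance, truncated at 0.
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sw_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sw_star
    return tau2, pooled, 1.0 / sw_star
```

When the studies agree closely, Q falls below its degrees of freedom, the estimate of the between-study variance is truncated to zero, and the result coincides with the fixed effect analysis.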

9.
Lu Chen  Li Hsu  Kathleen Malone 《Biometrics》2009,65(4):1105-1114
Summary: The population‐based case–control study design is perhaps one of, if not the most, commonly used designs for investigating the genetic and environmental contributions to disease risk in epidemiological studies. Ages at onset and disease status of family members are routinely and systematically collected from the participants in this design. Considering age at onset in relatives as an outcome, this article is focused on using the family history information to obtain the hazard function, i.e., age‐dependent penetrance function, of candidate genes from case–control studies. A frailty‐model‐based approach is proposed to accommodate the shared risk among family members that is not accounted for by observed risk factors. This approach is further extended to accommodate missing genotypes in family members and a two‐phase case–control sampling design. Simulation results show that the proposed method performs well in realistic settings. Finally, a population‐based two‐phase case–control breast cancer study of the BRCA1 gene is used to illustrate the method.

10.
Robert M. Dorazio 《Biometrics》2012,68(4):1303-1312
Summary: Several models have been developed to predict the geographic distribution of a species by combining measurements of covariates of occurrence at locations where the species is known to be present with measurements of the same covariates at other locations where species occurrence status (presence or absence) is unknown. In the absence of species detection errors, spatial point‐process models and binary‐regression models for case‐augmented surveys provide consistent estimators of a species’ geographic distribution without prior knowledge of species prevalence. In addition, these regression models can be modified to produce estimators of species abundance that are asymptotically equivalent to those of the spatial point‐process models. However, if species presence locations are subject to detection errors, neither class of models provides a consistent estimator of covariate effects unless the covariates of species abundance are distinct and independently distributed from the covariates of species detection probability. These analytical results are illustrated using simulation studies of data sets that contain a wide range of presence‐only sample sizes. Analyses of presence‐only data of three avian species observed in a survey of landbirds in western Montana and northern Idaho are compared with site‐occupancy analyses of detections and nondetections of these species.

11.
A nonparametric estimator of a joint distribution function F0 of a d‐dimensional random vector with interval‐censored (IC) data is the generalized maximum likelihood estimator (GMLE), where d ≥ 2. The GMLE of F0 with univariate IC data is uniquely defined at each follow‐up time. However, this is no longer true in general with multivariate IC data as demonstrated by a data set from an eye study. How to estimate the survival function and the covariance matrix of the estimator in such a case is a new practical issue in analyzing IC data. We propose a procedure in such a situation and apply it to the data set from the eye study. Our method always results in a GMLE with a nonsingular sample information matrix. We also give a theoretical justification for such a procedure. Extension of our procedure to Cox's regression model is also mentioned.

12.
When a case‐control study is planned to include an internal validation study, the sample size of the study and the proportion of validated observations have to be calculated. There are a variety of alternative methods to accomplish this. In this article some possible procedures are compared in order to clarify whether the suggested optimal designs differ considerably depending on the method used.

13.
A distribution‐free two‐sample rank test is proposed for testing for differences between survival distributions in the analysis of biomedical studies in which two groups of subjects are followed over time for a particular outcome, which may recur. This method is motivated by an observational HIV (human immunodeficiency virus) study in which a group of HIV‐seropositive women and a comparable group of HIV‐seronegative women were examined every 6 months for the presence of cervical intraepithelial neoplasia (CIN), the cervical cancer precursor. Women entered the study serially and were subject to random loss to follow‐up. Only women free of CIN at study entry were followed, resulting in left‐truncated survival times. If a woman is found to have CIN at a later examination, she is treated and then followed until CIN recurs. The two groups of women were compared at both occurrences of CIN on the basis of rank statistics. For the first occurrence of CIN, survival times since the beginning of the study (based on calendar time) are compared. For a recurrence of CIN, survival times since the first development of CIN are compared. The proposed test statistic for an overall difference between the two groups follows a chi‐square distribution with two degrees of freedom. Simulation results demonstrate the usefulness of the proposed test statistic, which reduces to the Gehan statistic if each person is followed only to the first failure and there is no serial enrollment.

14.
When applying the Cochran‐Armitage (CA) trend test for an association between a candidate allele and a disease in a case‐control study, a set of scores must be assigned to the genotypes. Sasieni (1997, Biometrics 53, 1253–1261) suggested scores for the recessive, additive, and dominant models but did not examine their statistical properties. Using the criteria of minimizing the required sample size of the CA trend test to achieve prespecified type I and type II errors, we show that the scores given by Sasieni (1997) are optimal for the recessive and dominant models and locally optimal for the additive one. Moreover, the additive scores are shown to be locally optimal for the multiplicative model. The tests are applied to a real dataset.
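
To make the role of the scores concrete, here is a minimal Python sketch of the CA trend statistic for a 2×3 genotype table. The abstract does not list the score values, so the conventional choices (0, 0, 1), (0, 1, 2) and (0, 1, 1) are assumed here, with genotypes ordered by the number of copies of the candidate allele. The name `ca_trend_test` and the `SCORES` table are illustrative, and the variance uses the common large-sample form with N rather than N − 1.

```python
import math

# Assumed conventional scores for genotypes carrying 0, 1, 2 candidate alleles.
SCORES = {
    "recessive": (0, 0, 1),
    "additive": (0, 1, 2),
    "dominant": (0, 1, 1),
}


def ca_trend_test(cases, controls, scores):
    """Cochran-Armitage trend test for a 2 x 3 case-control genotype table.

    cases, controls: genotype counts ordered by number of candidate alleles.
    Returns (chi2, p_value); chi2 has 1 degree of freedom under the null.
    """
    n = [r + s for r, s in zip(cases, controls)]
    R, S = sum(cases), sum(controls)
    N = R + S
    sum_xr = sum(x * r for x, r in zip(scores, cases))
    sum_xn = sum(x * m for x, m in zip(scores, n))
    sum_x2n = sum(x * x * m for x, m in zip(scores, n))
    denom = R * S * (N * sum_x2n - sum_xn ** 2)
    if denom == 0:
        raise ValueError("degenerate table or constant scores")
    chi2 = N * (N * sum_xr - R * sum_xn) ** 2 / denom
    # Upper tail of a chi-square with 1 df: P(X > c) = erfc(sqrt(c / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

The statistic is unchanged by shifting or rescaling the scores, so (0, 1, 2) and (0, 1/2, 1) give the same result, and with dominant or recessive scores it reduces to the Pearson chi-square on the corresponding collapsed 2×2 table.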

15.
A study of the genetic model of familial type 2 diabetes mellitus   Total citations: 2 (self-citations: 0; citations by others: 2)
王劲松  周玲  成金罗  沈默宇 《遗传》2003,25(6):637-640
A total of 136 large pedigrees, each ascertained through a proband with familial type 2 diabetes mellitus (type 2 DM) seen as an outpatient or inpatient in 1999–2000, were studied to explore the genetic model of the disease. Heritability was estimated by Falconer's method, multifactorial inheritance was analysed by Penrose's method, and complex segregation analysis was performed by fitting the class A regressive logistic model with the S.A.G.E-REGD software. The heritability of type 2 DM in the 136 pedigrees was 94.07% ± 5.84%, suggesting that a dominant major gene may be present in these pedigrees. The multifactorial analysis indicated two genetic patterns in this population, differing by sex. Complex segregation analysis rejected the purely environmental, non-transmitted, and co-dominant models and accepted the autosomal recessive (AR) and autosomal dominant (AD) models, with the recessive model fitting best. Type 2 DM therefore shows high heritability and genetic heterogeneity: overall it behaves as a multifactorial disease, while autosomal dominant inheritance determined by a major gene may exist in some pedigrees with a relatively homogeneous genetic background.

16.
Case‐control studies are primary study designs used in genetic association studies. Sasieni (Biometrics 1997, 53, 1253–1261) pointed out that the allelic chi‐square test used in genetic association studies is invalid when Hardy‐Weinberg equilibrium (HWE) is violated in a combined population. It is important to know how far the type I error rate deviates from the nominal level when HWE is violated. We examine bounds of the type I error rate of the allelic chi‐square test. We also investigate the power of the goodness‐of‐fit test for HWE, which can be used as a guideline for selecting an appropriate test between the allelic chi‐square test and the modified allelic chi‐square test, the latter of which was proposed for cases of violated HWE. In small samples, power is not large enough to detect Wright's inbreeding model with small values of the inbreeding coefficient. Therefore, when the null hypothesis of HWE is barely accepted, the modified test should be considered as an alternative method. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)
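
The goodness‐of‐fit test for HWE mentioned in this abstract can be sketched in Python for a single biallelic locus. This is the textbook one-degree-of-freedom chi‐square test, not the power or type I error calculations of the paper. The function name `hwe_chi_square` is hypothetical, and the inbreeding coefficient is estimated as one minus the ratio of observed to expected heterozygotes.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb: observed genotype counts at a biallelic locus.
    Returns (chi2, p_value, f_hat), where f_hat estimates the inbreeding
    coefficient as 1 - observed heterozygotes / expected heterozygotes.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele a
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        raise ValueError("locus is monomorphic; the test is undefined")
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    f_hat = 1.0 - n_ab / expected[1]
    # Upper tail of a chi-square with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value, f_hat
```

For this test the statistic equals n times the squared estimate of the inbreeding coefficient, which shows directly why small coefficients are hard to detect in small samples.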

17.
The application of stabilized multivariate tests is demonstrated in the analysis of a two‐stage adaptive clinical trial with three treatment arms. Due to the clinical problem, the multiple comparisons include tests of superiority as well as a test for non‐inferiority, where non‐inferiority is (because of missing absolute tolerance limits) expressed as a linear contrast of the three treatments. Special emphasis is placed on the combination of the three sources of multiplicity: multiple endpoints, multiple treatments, and two stages of the adaptive design. In particular, the adaptation after the first stage comprises a change of the a‐priori order of hypotheses.

18.
Modelling heterogeneity of capture is an important problem in estimating animal abundance from capture–recapture data, with underestimation of abundance occurring if different animals have intrinsically high or low capture probabilities. Mixture models are useful in many cases to model the heterogeneity. We summarise mixture model results for closed populations, using a skink data set for illustration. New mixture models for heterogeneous open populations are discussed, and a closed population model is shown to have new and potentially effective applications in community analysis. (© 2008 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

19.
Multivariate data analysis (MVDA) is a highly valuable and significantly underutilized resource in biomanufacturing. It offers the opportunity to enhance understanding and leverage useful information from complex high‐dimensional data sets, recorded throughout all stages of therapeutic drug manufacture. To help standardize the application and promote this resource within the biopharmaceutical industry, this paper outlines a novel MVDA methodology describing the necessary steps for efficient and effective data analysis. The MVDA methodology is followed to solve two case studies: a “small data” and a “big data” challenge. In the “small data” example, a large‐scale data set is compared to data from a scale‐down model. This methodology enables a new quantitative metric for equivalence to be established by combining a two one‐sided test with principal component analysis. In the “big data” example, this methodology enables accurate predictions of critical missing data essential to a cloning study performed in the ambr15 system. These predictions are generated by exploiting the underlying relationship between the off‐line missing values and the on‐line measurements through the generation of a partial least squares model. In summary, the proposed MVDA methodology highlights the importance of data pre‐processing, restructuring, and visualization during data analytics to solve complex biopharmaceutical challenges.

20.
Summary: In individually matched case–control studies, when some covariates are incomplete, an analysis based on the complete data may result in a large loss of information both in the missing and completely observed variables. This usually results in a bias and loss of efficiency. In this article, we propose a new method for handling the problem of missing covariate data based on a missing‐data‐induced intensity approach when the missingness mechanism does not depend on case–control status and show that this leads to a generalization of the missing indicator method. We derive the asymptotic properties of the estimates from the proposed method and, using an extensive simulation study, assess the finite sample performance in terms of bias, efficiency, and 95% confidence coverage under several missing data scenarios. We also make comparisons with complete‐case analysis (CCA) and some missing data methods that have been proposed previously. Our results indicate that, under the assumption of predictable missingness, the suggested method provides valid estimation of parameters, is more efficient than CCA, and is competitive with other, more complex methods of analysis. A case–control study of multiple myeloma risk and a polymorphism in the receptor Inter‐Leukin‐6 (IL‐6‐α) is used to illustrate our findings.
