Similar literature
1.
For analyzing longitudinal binary data with nonignorable and nonmonotone missing responses, a full likelihood method is complicated algebraically, and often requires intensive computation, especially when there are many follow-up times. As an alternative, a pseudolikelihood approach has been proposed in the literature under minimal parametric assumptions. This formulation only requires specification of the marginal distributions of the responses and missing data mechanism, and uses an independence working assumption. However, this estimator can be inefficient for estimating both time-varying and time-stationary effects under moderate to strong within-subject associations among repeated responses. In this article, we propose an alternative estimator, based on a bivariate pseudolikelihood, and demonstrate in simulations that the proposed method can be much more efficient than the previous pseudolikelihood obtained under the assumption of independence. We illustrate the method using longitudinal data on CD4 counts from two clinical trials of HIV-infected patients.

2.
The statistical methodology for the analysis of replicated spatial point patterns in complex designs such as those including replications is fairly undeveloped. A mixed model is developed in conjunction with maximum pseudolikelihood and generalized linear mixed modeling by extending Baddeley and Turner's (2000, Australian and New Zealand Journal of Statistics 42, 283-322) work on pseudolikelihood for single patterns. A simulation experiment is performed on parameter estimation. Fixed- and mixed-effect models are compared, and in some respects the mixed model is found to be superior. An example using the Strauss process for modeling neuron locations in post-mortem brain slices is shown.

3.
Phylogenetic networks are necessary to represent the tree of life expanded by edges to represent events such as horizontal gene transfers, hybridizations or gene flow. Not all species follow the paradigm of vertical inheritance of their genetic material. While a great deal of research has flourished into the inference of phylogenetic trees, statistical methods to infer phylogenetic networks are still limited and under development. The main disadvantage of existing methods is a lack of scalability. Here, we present a statistical method to infer phylogenetic networks from multi-locus genetic data in a pseudolikelihood framework. Our model accounts for incomplete lineage sorting through the coalescent model, and for horizontal inheritance of genes through reticulation nodes in the network. Computation of the pseudolikelihood is fast and simple, and it avoids the burdensome calculation of the full likelihood which can be intractable with many species. Moreover, estimation at the quartet-level has the added computational benefit that it is easily parallelizable. Simulation studies comparing our method to a full likelihood approach show that our pseudolikelihood approach is much faster without compromising accuracy. We applied our method to reconstruct the evolutionary relationships among swordtails and platyfishes (Xiphophorus: Poeciliidae), a group characterized by widespread hybridizations.

4.
Detection bias in recessive ascertainment is generally considered to be confined in a narrow range between unbiased truncate ascertainment and single ascertainment, where methods of segregation analysis are established. While there are arguments for an extended range of analysis, a deflated detection progression below the unbiased level is still being considered as theoretical ground or ignored as sporadics. I show here a method of gauging the ascertainment levels of surveyed data in a geometric continuum. The method is valid for recessive segregation at any ascertainment level and in simplex or multiplex sibships of whatever degree of truncation. Four previously published surveys are used to show conformation with real data and the existence of detection trends spanning the range from the unsuspected very depressed bias level to the inflated level above single ascertainment.

5.
DNA sequence copy number has been shown to be associated with cancer development and progression. Array-based comparative genomic hybridization (aCGH) is a recent development that seeks to identify the copy number ratio at large numbers of markers across the genome. Due to experimental and biological variations across chromosomes and hybridizations, current methods are limited to analyses of single chromosomes. We propose a more powerful approach that borrows strength across chromosomes and hybridizations. We assume a Gaussian mixture model, with a hidden Markov dependence structure and with random effects to allow for intertumoral variation, as well as intratumoral clonal variation. For ease of computation, we base estimation on a pseudolikelihood function. The method produces quantitative assessments of the likelihood of genetic alterations at each clone, along with a graphical display for simple visual interpretation. We assess the characteristics of the method through simulation studies and analysis of a brain tumor aCGH data set. We show that the pseudolikelihood approach is superior to existing methods both in detecting small regions of copy number alteration and in accurately classifying regions of change when intratumoral clonal variation is present. Software for this approach is available at http://www.biostat.harvard.edu/~betensky/papers.html.

6.
Cannings and Thompson suggested conditioning on the phenotypes of the probands to correct for ascertainment in the analysis of pedigree data. The method assumes single ascertainment and can be expected to yield asymptotically biased parameter estimates except in this specific case. However, because the method is easy to apply, we investigated the degree of bias in the more typical situation of multiple ascertainment, in the hope that the bias might be small and that the method could be applied more generally. To explore the utility of conditioning on probands to correct for multiple ascertainment, we calculated the asymptotic value of the segregation ratio for two versions of the simple Mendelian segregation model on sibship data. For both versions, we found that this asymptotic value decreased approximately linearly as the ascertainment probability increased. When ascertainment was complete, the segregation-ratio estimates were zero, not just asymptotically but for finite sample size as well. In some cases, conditioning on probands actually resulted in greater parameter bias than no ascertainment correction at all. These results hold for a variety of sibship-size distributions, several modes of inheritance, and a wide range of population prevalences of affected individuals.

7.
A pseudolikelihood method for analyzing interval censored data
We introduce a method based on a pseudolikelihood ratio for estimating the distribution function of the survival time in a mixed-case interval censoring model. In a mixed-case model, an individual is observed a random number of times, and at each time it is recorded whether an event has happened or not. One seeks to estimate the distribution of time to event. We use a Poisson process as the basis of a likelihood function to construct a pseudolikelihood ratio statistic for testing the value of the distribution function at a fixed point, and show that this converges under the null hypothesis to a known limit distribution, that can be expressed as a functional of different convex minorants of a two-sided Brownian motion process with parabolic drift. Construction of confidence sets then proceeds by standard inversion. The computation of the confidence sets is simple, requiring the use of the pool-adjacent-violators algorithm or a standard isotonic regression algorithm. We also illustrate the superiority of the proposed method over competitors based on resampling techniques or on the limit distribution of the maximum pseudolikelihood estimator, through simulation studies, and illustrate the different methods on a dataset involving time to HIV seroconversion in a group of haemophiliacs.
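
The abstract above relies on the pool-adjacent-violators algorithm without describing it. As background only, here is a minimal generic sketch of that algorithm for a nondecreasing least-squares fit. It is not the authors' confidence-set procedure, and the function name `pava` and its optional weights argument are illustrative assumptions.

```python
def pava(y, w=None):
    """Pool-adjacent-violators: nondecreasing weighted least-squares fit to y.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while adjacent block means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, n1 + n2])
    # Expand the pooled block means back to one fitted value per observation.
    fit = []
    for mean, _, n in blocks:
        fit.extend([mean] * n)
    return fit


print(pava([1, 3, 2, 4]))  # [1.0, 2.5, 2.5, 4.0]
```

The violating pair (3, 2) is pooled into its average, which is the whole idea: the fit is piecewise constant wherever the raw values decrease.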

8.
In many observational studies, individuals are measured repeatedly over time, although not necessarily at a set of pre-specified occasions. Instead, individuals may be measured at irregular intervals, with those having a history of poorer health outcomes being measured with somewhat greater frequency and regularity. In this paper, we consider likelihood-based estimation of the regression parameters in marginal models for longitudinal binary data when the follow-up times are not fixed by design, but can depend on previous outcomes. In particular, we consider assumptions regarding the follow-up time process that result in the likelihood function separating into two components: one for the follow-up time process, the other for the outcome measurement process. The practical implication of this separation is that the follow-up time process can be ignored when making likelihood-based inferences about the marginal regression model parameters. That is, maximum likelihood (ML) estimation of the regression parameters relating the probability of success at a given time to covariates does not require that a model for the distribution of follow-up times be specified. However, to obtain consistent parameter estimates, the multinomial distribution for the vector of repeated binary outcomes must be correctly specified. In general, ML estimation requires specification of all higher-order moments and the likelihood for a marginal model can be intractable except in cases where the number of repeated measurements is relatively small. To circumvent these difficulties, we propose a pseudolikelihood for estimation of the marginal model parameters. The pseudolikelihood uses a linear approximation for the conditional distribution of the response at any occasion, given the history of previous responses. The appeal of this approximation is that the conditional distributions are functions of the first two moments of the binary responses only. When the follow-up times depend only on the previous outcome, the pseudolikelihood requires correct specification of the conditional distribution of the current outcome given the outcome at the previous occasion only. Results from a simulation study and a study of asymptotic bias are presented. Finally, we illustrate the main results using data from a longitudinal observational study that explored the cardiotoxic effects of doxorubicin chemotherapy for the treatment of acute lymphoblastic leukemia in children.

9.
We developed a likelihood-based method for testing for parent-of-origin effect in complex diseases. The likelihood formulations model parent-of-origin effect and allow for incorporation of ascertainment, as well as differential male and female ascertainment probabilities. The results based on simulated data indicated that the estimates of parental effect (either maternal or paternal) were biased when ascertainment was ignored or when the wrong ascertainment model was used. The exception was single ascertainment, in which we proved that ignoring ascertainment does not bias the estimation of parental effect, in a simple parent-of-origin model. These results underscore the importance of considering ascertainment models when testing for parent-of-origin effect in complex diseases.

10.
Feng R, Zhang H. Human Genetics 2006, 119(4): 429-435
Most genetic studies recruit high risk families and the discoveries are based on non-random selected groups. We must consider the consequences of this ascertainment process in order to apply the results of genetic research to the general population. In previous reports, we developed a latent variable model to assess the familial aggregation and inheritability of ordinal-scaled diseases, and found a major gene component of alcoholism after applying the model to the data from the Yale family study of comorbidity of alcoholism and anxiety (YFSCAA). In this report, we examine the ascertainment effects on parameter estimates and correct potential bias in the latent variable model. The simulation studies for various ascertainment schemes suggest that our ascertainment adjustment is necessary and effective. We also find that the estimated effects are relatively unbiased for the particular ascertainment scheme used in the YFSCAA, which assures the validity of our earlier conclusion.

11.
Procedures to estimate the genetic segregation parameter when ascertainment of families is incomplete have previously relied on iterative computer algorithms, since estimators with closed form are lacking. We now present the Minimum Variance Unbiased Estimator for the segregation parameter under any ascertainment probability. This estimator assumes a simple form when ascertainment is complete. We also present a simple estimator, akin to Li and Mantel's (1968) estimator, but without the restriction that ascertainment be complete. The performance of these estimators is compared with respect to asymptotic efficiency. We also provide tables that define the required number of families of a given size that need to be sampled to achieve a specific power for testing simple hypotheses on the segregation parameter.
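
For context on the Li and Mantel (1968) estimator mentioned above, a small sketch of its usual form under complete ascertainment is given here: with T children in total, R affected children, and J sibships having exactly one affected child, the estimate is (R - J) / (T - J). This is background only and not the new estimators proposed in the abstract. The function name `li_mantel` and the toy data are assumptions made for illustration.

```python
def li_mantel(sibships):
    """Li-Mantel segregation-ratio estimate under complete ascertainment.

    sibships: list of (size, affected) pairs, each with affected >= 1.
    Hypothetical helper; the data layout is an assumption.
    """
    total_children = sum(size for size, _ in sibships)        # T
    total_affected = sum(aff for _, aff in sibships)          # R
    singletons = sum(1 for _, aff in sibships if aff == 1)    # J
    return (total_affected - singletons) / (total_children - singletons)


# Made-up sibships, each ascertained because it has at least one affected child.
data = [(3, 1), (3, 1), (3, 2), (4, 1), (4, 2)]
naive = sum(a for _, a in data) / sum(s for s, _ in data)
print(round(naive, 3))            # 0.412, inflated by dropping unaffected sibships
print(round(li_mantel(data), 3))  # 0.286
```

The naive proportion is biased upward because sibships with no affected children never enter the sample; removing one child per singleton sibship from both numerator and denominator corrects for that truncation.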

12.
The ascertainment problem arises when families are sampled by a nonrandom process and some assumption about this sampling process must be made in order to estimate genetic parameters. Under classical ascertainment assumptions, estimation of genetic parameters cannot be separated from estimation of the parameters of the ascertainment process, so that any misspecification of the ascertainment process causes biases in estimation of the genetic parameters. Ewens and Shute proposed a resolution to this problem, involving conditioning the likelihood of the sample on the part of the data which is "relevant to ascertainment." The usefulness of this approach can only be assessed by examining the properties (in particular, bias and standard error) of the estimates which arise by using it for a wide range of parameter values and family size distributions and then comparing these biases and standard errors with those arising under classical ascertainment procedures. These comparisons are carried out in the present paper, and we also compare the proposed method with procedures which condition on, or ignore, parts of the data.

13.
Surveys of variability of homologous microsatellite loci among species reveal an ascertainment bias for microsatellite length where microsatellite loci isolated in one species tend to be longer than homologous loci in related species. Here, we take advantage of the availability of aligned human and chimpanzee genome sequences to compare length difference of homologous microsatellites for loci identified in humans to length difference for loci identified in chimpanzees. We are able to quantify ascertainment bias for a range of motifs and microsatellite lengths. Because ascertainment bias should not exist if a microsatellite selected in one species is as likely to be longer as it is to be shorter than its homologue, we propose that the nature of ascertainment bias can provide evidence for understanding how microsatellites evolve. We show that bias is greater for longer microsatellites but also that many long microsatellites have short homologues. These results are consistent with the notion that growth of long microsatellites is constrained by an upper length boundary that, when reached, sometimes results in large deletions. By evaluating ascertainment bias separately for interrupted and uninterrupted repeats we also show that long microsatellites tend to become interrupted, thereby contributing a second component of ascertainment bias. Having accounted for ascertainment bias, in agreement with results published elsewhere, we find that microsatellites in humans are longer on average than those in chimpanzees. This length difference is similar among repeat motifs but surprisingly comprises two roughly equal components, one associated with the repeats themselves and one with the flanking sequences. The differences we find can only be explained if microsatellites are both evolving directionally under a biased mutation process and are doing so at different rates in different closely related species.

14.
Despite increased interest in applying single nucleotide polymorphism (SNP) data to questions in natural systems, one unresolved issue is to what extent the ascertainment bias induced during the SNP discovery phase will impact available analysis methods. Although most studies addressing ascertainment bias have focused on human populations, it is not clear whether existing methods will work when applied to other species with more complex demographic histories and more significant levels of population structure. Here we present findings from an empirical approach to exploring the effect of population structure on issues of ascertainment bias in the Eastern Fence Lizard, Sceloporus undulatus. We find that frequency spectra and summary statistics were highly sensitive to SNP discovery strategy, necessitating careful selection of the initial ascertainment panel. Randomly selected ascertainment panels performed as well as ascertainment panels chosen to jointly sample geographic, phenotypic, and genetic diversity. Geographically restricted panels resulted in larger biases. Additionally, we found existing ascertainment bias correction methods, which were not developed for geographically structured data sets, were largely effective at reducing the impact of ascertainment bias. Because bias correction methods performed well even when underlying assumptions were violated, our results suggest tools are currently available to analyze SNP data in structured populations.

15.
Ross EA, Moore D. Biometrics 1999, 55(3): 813-819
We have developed methods for modeling discrete or grouped time, right-censored survival data collected from correlated groups or clusters. We assume that the marginal hazard of failure for individual items within a cluster is specified by a linear log odds survival model and the dependence structure is based on a gamma frailty model. The dependence can be modeled as a function of cluster-level covariates. Likelihood equations for estimating the model parameters are provided. Generalized estimating equations for the marginal hazard regression parameters and pseudolikelihood methods for estimating the dependence parameters are also described. Data from two clinical trials are used for illustration purposes.

16.
On resolving the ascertainment biases of the observed data in the geometric continuum $v^{\text{affected}-1} \times P(\text{sibship})$, where $0 < v \to \infty$, four published ascertainments of rheumatic fever show excellent conformation with Mendelian recessive segregation, even in multiplex sibships. In two surveys in which ascertainment bias is near or a little above random sampling ($v = 1$), this conclusion is further corroborated by classical segregation analysis. The other two surveys have bias trends declining ($v < 1$) very much below random sampling. Such levels of ascertainment bias, if defined through the ascertainment probability parameter $\pi$, would be out of range because the range is from single ascertainment, where $\pi \to 0$, to random sampling, where $\pi = 1$, and probability cannot exceed unity. Highly successful antimicrobial measures that would reduce the number of diseased sibs independent of the distribution of susceptible sibs could produce a dissociation of the gene-to-"rheumatic" relationship and thus explains the declining ascertainment bias.

17.
Autism is a severe developmental disorder of unknown etiology but with evidence for genetic influences. Here, we provide evidence for a genetic basis of several quantitative traits that are related to autism. These traits, from the Broader Phenotype Autism Symptom Scale (BPASS), were measured in nuclear families, each ascertained through two probands affected by autism spectrum disorder. The BPASS traits capture the continuum of severity of impairments and may be more informative for genetic studies than are the discrete diagnoses of autism that have been used by others. Using a sample of 201 nuclear families consisting of a total of 694 individuals, we implemented multivariate polygenic models with ascertainment adjustment to estimate heritabilities and genetic and environmental correlations between these traits. Our ascertainment adjustment uses conditioning on the phenotypes of probands, requires no modeling of the ascertainment process, and is applicable to multiplex ascertainment and multivariate traits. This appears to be the first such implementation for multivariate quantitative traits. The marked difference between heritability estimates of the trait for language onset with and without an ascertainment adjustment (0.08 and 0.22, respectively) shows that conclusions are sensitive to whether or not an ascertainment adjustment is used. Among the five BPASS traits that were analyzed, the traits for social motivation and range of interest/flexibility show the highest heritability (0.19 and 0.16, respectively) and also have the highest genetic correlation (0.92). This finding suggests a shared genetic basis of these two traits and that they may be most promising for future gene mapping and for extending pedigrees by phenotyping additional relatives.

18.
A method of historical inference that accounts for ascertainment bias is developed and applied to single-nucleotide polymorphism (SNP) data in humans. The data consist of 84 short fragments of the genome that were selected, from three recent SNP surveys, to contain at least two polymorphisms in their respective ascertainment samples and that were then fully resequenced in 47 globally distributed individuals. Ascertainment bias is the deviation, from what would be observed in a random sample, caused either by discovery of polymorphisms in small samples or by locus selection based on levels or patterns of polymorphism. The three SNP surveys from which the present data were derived differ both in their protocols for ascertainment and in the size of the samples used for discovery. We implemented a Monte Carlo maximum-likelihood method to fit a subdivided-population model that includes a possible change in effective size at some time in the past. Incorrectly assuming that ascertainment bias does not exist causes errors in inference, affecting both estimates of migration rates and historical changes in size. Migration rates are overestimated when ascertainment bias is ignored. However, the direction of error in inferences about changes in effective population size (whether the population is inferred to be shrinking or growing) depends on whether either the numbers of SNPs per fragment or the SNP-allele frequencies are analyzed. We use the abbreviation "SDL," for "SNP-discovered locus," in recognition of the genomic-discovery context of SNPs. When ascertainment bias is modeled fully, both the number of SNPs per SDL and their allele frequencies support a scenario of growth in effective size in the context of a subdivided population. If subdivision is ignored, however, the hypothesis of constant effective population size cannot be rejected. An important conclusion of this work is that, in demographic or other studies, SNP data are useful only to the extent that their ascertainment can be modeled.
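
To make concrete why discovery in small samples distorts SNP data, here is a deliberately simplified sketch. It assumes a single randomly mating population and a biallelic site with allele frequency p, so that the chance the site appears polymorphic in a discovery sample of n chromosomes is 1 - p^n - (1 - p)^n. This is a textbook simplification, not the Monte Carlo maximum-likelihood model of the abstract, and the function name `discovery_prob` is hypothetical.

```python
def discovery_prob(p, n):
    """Probability that both alleles appear among n sampled chromosomes.

    Assumes independent draws from one population with allele frequency p
    (an illustrative simplification, not the paper's subdivided model).
    """
    return 1.0 - p**n - (1.0 - p)**n


# Rare alleles are much less likely to be discovered in a small panel.
for n in (2, 10):
    for p in (0.01, 0.1, 0.5):
        print(f"n={n:2d}  p={p:.2f}  P(discovered)={discovery_prob(p, n):.4f}")
```

Because low-frequency variants are rarely caught by a small panel, the frequency spectrum of discovered SNPs is skewed toward intermediate frequencies, which is the kind of deviation the abstract says must be modeled.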

19.
A Bayesian solution for making inferences about segregation parameters with no information about the ascertainment is presented. Inferences about the segregation probability and the probability of being sporadic are made through the posterior marginal distribution of these parameters after integrating out the ascertainment probability, the nuisance parameter. The method was tested with real and simulated data and performed well. Original Fanconi anemia data, for which no information about the ascertainment was available, were then analyzed, with results that confirmed a monogenic autosomal recessive mode of inheritance.

20.
The Danish Twin Registry is the oldest national twin register in the world, initiated in 1954 by ascertainment of twins born from 1870 to 1910. During a number of studies birth cohorts have been added to the register, and by the recent addition of birth cohorts from 1931 to 1952 the Registry now comprises 127 birth cohorts of twins from 1870 to 1996, with a total of more than 65,000 twin pairs included. In all cohorts the ascertainment has been population-based and independent of the traits studied, although different procedures of ascertainment have been employed. In the oldest cohorts only twin pairs with both twins surviving to age 6 have been included while from 1931 all ascertained twins are included. The completeness of the ascertainment after adjustment for infant mortality is high, with approximately 90% ascertained up to 1968, and complete ascertainment of all liveborn twin pairs since 1968. The Danish Twin Registry is used as a source for large studies on genetic influence on aging and age-related health problems, normal variation in clinical parameters associated with the metabolic syndrome and cardiovascular diseases, and clinical studies of specific diseases. The combination of survey data with data obtained by linkage to national health related registers enables follow-up studies both of the general twin population and of twins from clinical studies.
