20 similar articles found (search time: 15 ms)
Albert PS 《Biometrics》2000,56(2):602-608
Binary longitudinal data are often collected in clinical trials when interest lies in assessing the effect of a treatment over time. Our application is a recent study of opiate addiction that examined the effect of a new treatment on repeated urine tests to assess opiate use over an extended follow-up. Drug addiction is episodic, and a new treatment may affect various features of the opiate-use process such as the proportion of positive urine tests over follow-up and the time to the first occurrence of a positive test. Complications in this trial were the large amounts of dropout and intermittent missing data and the large number of observations on each subject. We develop a transitional model for longitudinal binary data subject to nonignorable missing data and propose an EM algorithm for parameter estimation. We use the transitional model to derive summary measures of the opiate-use process that can be compared across treatment groups to assess treatment effect. Through analyses and simulations, we show the importance of properly accounting for the missing data mechanism when assessing the treatment effect in our example.
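
The two summary measures named in this abstract (proportion of positive tests and time to first positive test) can be illustrated with a minimal sketch. It assumes a plain two-state first-order Markov chain with no covariates and no missing data, so it is not the authors' model; the function name and the transition probabilities are hypothetical.

```python
def transition_summaries(p01, p11):
    """Summaries of a two-state first-order Markov chain of urine-test results.

    p01: P(positive at visit t | negative at visit t-1)
    p11: P(positive at visit t | positive at visit t-1)
    Both are hypothetical inputs; no covariates or missing-data model here.
    """
    # Long-run (stationary) proportion of positive tests.
    long_run_positive = p01 / (p01 + (1.0 - p11))
    # Expected number of visits until the first positive test, starting
    # from a negative state: a geometric waiting time with success rate p01.
    mean_visits_to_first_positive = 1.0 / p01
    return long_run_positive, mean_visits_to_first_positive


# Hypothetical transition probabilities for two treatment arms.
control = transition_summaries(p01=0.30, p11=0.70)
treated = transition_summaries(p01=0.20, p11=0.60)
```

Comparing `control` and `treated` shows how a treatment that changes the transition probabilities shifts both summaries at once, which is the sense in which such measures can be compared across arms.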

Modeling the joint distribution of a binary trait (disease) within families is a tedious challenge, owing to the lack of a general statistical model with desirable properties such as the multivariate Gaussian model for a quantitative trait. Models have been proposed that either assume the existence of an underlying liability variable, the reality of which cannot be checked, or provide estimates of aggregation parameters that are dependent on the ordering of family members and on family size. We describe how a class of copula models for the analysis of exchangeable categorical data can be incorporated into a familial framework. In this class of models, the joint distribution of binary outcomes is characterized by a function of the given marginals. This function, referred to as a "copula," depends on an aggregation parameter that is weakly dependent on the marginal distributions. We propose to decompose a nuclear family into two sets of equicorrelated data (parents and offspring), each of which is characterized by an aggregation parameter (alphaFM and alphaSS, respectively). The marginal probabilities are modeled through a logistic representation. The advantage of this model is that it provides estimates of the aggregation parameters that are independent of family size and does not require any arbitrary ordering of sibs. It can be incorporated easily into segregation or combined segregation-linkage analysis and does not require extensive computer time. As an illustration, we applied this model to a combined segregation-linkage analysis of levels of plasma angiotensin I-converting enzyme (ACE) dichotomized into two classes according to the median. The conclusions of this analysis were very similar to those we had reported in an earlier familial analysis of quantitative ACE levels.

MOTIVATION: Microarray designs containing millions to hundreds of millions of probes that tile entire genomes are currently being released. Within the next 2 months, our group will release a microarray data set containing over 12,000,000 microarray measurements taken from 37 mouse tissues. A problem that will become increasingly significant in the upcoming era of genome-wide exon-tiling microarray experiments is the removal of cross-hybridization noise. We present a probabilistic generative model for cross-hybridization in microarray data and a corresponding variational learning method for cross-hybridization compensation, GenXHC, that reduces cross-hybridization noise by taking into account multiple sources for each mRNA expression level measurement, as well as prior knowledge of hybridization similarities between the nucleotide sequences of microarray probes and their target cDNAs. RESULTS: The algorithm is applied to a subset of an exon-resolution genome-wide Agilent microarray data set for chromosome 16 of Mus musculus and is found to produce statistically significant reductions in cross-hybridization noise. The denoised data is found to produce enrichment in multiple gene ontology-biological process (GO-BP) functional groups. The algorithm is found to outperform robust multi-array analysis, another method for cross-hybridization compensation.

A probabilistic generative model for GO enrichment analysis
The Gene Ontology (GO) is extensively used to analyze all types of high-throughput experiments. However, researchers still face several challenges when using GO and other functional annotation databases. One problem is the large number of multiple hypotheses that are being tested for each study. In addition, categories often overlap with both direct parents/descendants and other distant categories in the hierarchical structure. This makes it hard to determine if the identified significant categories represent different functional outcomes or rather a redundant view of the same biological processes. To overcome these problems, we developed a generative probabilistic model which identifies a (small) subset of categories that, together, explain the selected gene set. Our model accommodates noise and errors in the selected gene set and GO. Using controlled GO data, our method correctly recovered most of the selected categories, leading to dramatic improvements over current methods for GO analysis. When used with microarray expression data and ChIP-chip data from yeast and human, our method was able to correctly identify both general and specific enriched categories which were overlooked by other methods.

Pang Z  Kuk AY 《Biometrics》2005,61(4):1076-1084
Existing distributions for modeling fetal response data in developmental toxicology such as the beta-binomial distribution tend to inflate the probability of no malformed fetuses, and hence understate the risk of having at least one malformed fetus within a litter. As opposed to a shared probability extra-binomial model, we advocate a shared response model that allows a random number of fetuses within the same litter to share a common response. An explicit formula is given for the probability function and graphical plots suggest that it does not suffer from the problem of assigning too much probability to the event of no malformed fetuses. The EM algorithm can be used to estimate the model parameters. Results of a simulation study show that the EM estimates are nearly unbiased and the associated confidence intervals based on the usual standard error estimates have coverage close to the nominal level. Simulation results also suggest that the shared response model estimates of the marginal malformation probabilities are robust to misspecification of the distributional form, but not so for the estimates of intralitter correlation and the litter-level probability of having at least one malformed fetus. The proposed model is fitted to a set of data from the U.S. National Toxicology Program. For the same dose-response relationship, the fit based on the shared response distribution is superior to that based on the beta-binomial, and comparable to that based on the recently proposed q-power distribution (Kuk, 2004, Applied Statistics 53, 369-386). An advantage of the shared response model over the q-power distribution is that it is more interpretable and can be extended more easily to the multivariate case. To illustrate this, a bivariate shared response model is fitted to fetal response data involving visceral and skeletal malformation.
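
To make the "probability of no malformed fetuses" concrete, the following sketch computes that probability under a beta-binomial litter model and under an independent binomial with the same marginal probability. It does not implement the shared response model; the parameterization by marginal probability `mu` and intralitter correlation `rho`, and all numeric values, are assumptions chosen for illustration.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_p_zero(n, mu, rho):
    """P(no malformed fetuses in a litter of size n) under a beta-binomial
    with marginal probability mu and intralitter correlation rho."""
    total = (1.0 - rho) / rho          # a + b, since rho = 1 / (a + b + 1)
    a, b = mu * total, (1.0 - mu) * total
    # P(Y = 0) = B(a, b + n) / B(a, b)
    return math.exp(log_beta(a, b + n) - log_beta(a, b))


def binomial_p_zero(n, mu):
    """P(no malformed fetuses) when fetuses respond independently."""
    return (1.0 - mu) ** n


# Hypothetical litter: 10 fetuses, 10% marginal risk, correlation 0.2.
p0_bb = beta_binomial_p_zero(10, 0.1, 0.2)
p0_bin = binomial_p_zero(10, 0.1)
```

With these inputs `p0_bb` exceeds `p0_bin`, so the litter-level risk `1 - p0_bb` of at least one malformed fetus is smaller than the independent-fetus calculation would give; whether the beta-binomial overstates the zero class for a given data set is the empirical question the abstract addresses.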

Albert PS  Follmann DA  Wang SA  Suh EB 《Biometrics》2002,58(3):631-642
Longitudinal clinical trials often collect long sequences of binary data. Our application is a recent clinical trial in opiate addicts that examined the effect of a new treatment on repeated binary urine tests to assess opiate use over an extended follow-up. The dataset had two sources of missingness: dropout and intermittent missing observations. The primary endpoint of the study was comparing the marginal probability of a positive urine test over follow-up across treatment arms. We present a latent autoregressive model for longitudinal binary data subject to informative missingness. In this model, a Gaussian autoregressive process is shared between the binary response and missing-data processes, thereby inducing informative missingness. Our approach extends the work of others who have developed models that link the various processes through a shared random effect but do not allow for autocorrelation. We discuss parameter estimation using Monte Carlo EM and demonstrate through simulations that incorporating within-subject autocorrelation through a latent autoregressive process can be very important when longitudinal binary data are subject to informative missingness. We illustrate our new methodology using the opiate clinical trial data.

Shih JH  Albert PS 《Biometrics》1999,55(4):1232-1235
We propose a methodology for modeling correlated binary data measured with diagnostic error. A shared random effect is used to induce correlations in repeated true latent binary outcomes and in observed responses and to link the probability of a true positive outcome with the probability of having a diagnosis error. We evaluate the performance of our proposed approach through simulations and compare it with an ad hoc approach. The methodology is illustrated with data from a study that assessed the probability of corneal arcus in patients with familial hypercholesterolemia.

A simple method for analysing binary data

This paper focuses on analysis of spatiotemporal binary data with absorbing states. The research was motivated by a clinical study on amyotrophic lateral sclerosis (ALS), a neurological disease marked by gradual loss of muscle strength over time in multiple body regions. We propose an autologistic regression model to capture complex spatial and temporal dependencies in muscle strength among different muscles. As it is not clear how the disease spreads from one muscle to another, it may not be reasonable to define a neighborhood structure based on spatial proximity. Relaxing the requirement for prespecification of spatial neighborhoods as in existing models, our method identifies an underlying network structure empirically to describe the pattern of spreading disease. The model also allows the network autoregressive effects to vary depending on the muscles' previous status. Based on the joint distribution derived from this autologistic model, the joint transition probabilities of responses among locations can be estimated and the disease status can be predicted in the next time interval. Model parameters are estimated through maximization of penalized pseudo-likelihood. Postmodel selection inference was conducted via a bias-correction method, for which the asymptotic distributions were derived. Simulation studies were conducted to evaluate the performance of the proposed method. The method was applied to the analysis of muscle strength loss from the ALS clinical study.

Tan M  Qu Y  Rao JS 《Biometrics》1999,55(1):258-263
The marginal regression model offers a useful alternative to conditional approaches to analyzing binary data (Liang, Zeger, and Qaqish, 1992, Journal of the Royal Statistical Society, Series B 54, 3-40). Instead of modelling the binary data directly as do Liang and Zeger (1986, Biometrika 73, 13-22), the parametric marginal regression model developed by Qu et al. (1992, Biometrics 48, 1095-1102) assumes that there is an underlying multivariate normal vector that gives rise to the observed correlated binary outcomes. Although this parametric approach provides a flexible way to model different within-cluster correlation structures and does not restrict the parameter space, it is of interest to know how robust the parameter estimates are with respect to choices of the latent distribution. We first extend the latent modelling to include multivariate t-distributed latent vectors and assess the robustness in this class of distributions. Then we show through a simulation that the parameter estimates are robust with respect to the latent distribution even if the latent distribution is skewed. In addition to this empirical evidence for robustness, we show through the iterative algorithm that the robustness of the regression coefficients with respect to misspecifications of covariance structure in Liang and Zeger's model in fact indicates robustness with respect to underlying distributional assumptions of the latent vector in the latent variable model.
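
The idea of an underlying multivariate normal vector giving rise to correlated binary outcomes can be sketched by thresholding latent normals. This is a simplified illustration, not the estimation procedure of the paper: it assumes an exchangeable latent correlation `rho`, a common marginal probability `p`, and no covariates, and all names and values are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_exchangeable_binary(p, rho, cluster_size, n_clusters, seed=0):
    """Draw clusters of correlated binary outcomes by thresholding latent normals.

    Each latent value is sqrt(rho)*shared + sqrt(1-rho)*own, so every pair in a
    cluster has latent correlation rho; an outcome is 1 when the latent value
    falls below the normal quantile of p, giving marginal probability p.
    """
    rng = random.Random(seed)
    threshold = NormalDist().inv_cdf(p)
    clusters = []
    for _ in range(n_clusters):
        shared = rng.gauss(0.0, 1.0)  # cluster-level latent component
        ys = []
        for _ in range(cluster_size):
            z = rho ** 0.5 * shared + (1.0 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
            ys.append(1 if z < threshold else 0)
        clusters.append(ys)
    return clusters


clusters = simulate_exchangeable_binary(p=0.3, rho=0.5, cluster_size=4, n_clusters=20000)
# Empirical marginal probability of a positive outcome.
marginal = sum(map(sum, clusters)) / (4 * 20000)
# Empirical probability that the first two cluster members are both positive.
both_positive = sum(c[0] * c[1] for c in clusters) / 20000
```

The marginal probability stays near `p` while the joint probability for a pair exceeds `p ** 2`, which is the within-cluster dependence the latent vector induces. Replacing the normal draws with heavier-tailed or skewed ones is the kind of misspecification whose effect the abstract examines.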

A model is presented for the evolution and control of generative apomixis—a collective term for apomixis in animals and diplosporous apomixis in flowering plants. Its development takes into account data obtained from studies of apomictic-like processes in sexual organisms and in non-apomictic parthenogens, as well as data obtained from studies of generative apomicts. This approach provides insights into the evolution and control of generative apomixis that cannot be obtained from studies of generative apomicts alone. It is argued that the control of the avoidance of meiotic reduction during egg production in generative apomicts resides at a single locus, the identity of which can vary between lineages. This variation accounts for the observed variation between taxa in the pattern of avoidance of meiotic reduction. The affected locus contains a wild-type allele that codes for meiotic reduction and excess copies of a mutant allele that codes for its avoidance. The dominance relationship between these is determined by their ratio and by the environment. Environmental differences between female generative cells and somatic cells are such that the phenotypic expression of the mutant allele is favoured in the former, while that of the wild-type allele is favoured in the latter. This is important, for the locus is also involved in the control of mitosis which would be disrupted by the expression of the mutant allele in somatic cells. The requirement to maintain a viable pattern of growth and development explains why the wild-type allele is retained by generative apomicts, and this in turn explains why the ability to produce meiotically reduced eggs is retained by facultative forms and why it appears to be suppressed in, rather than absent from, obligate forms. 
The requirement for excess copies of the mutant allele in generative cells explains why generative apomicts are typically polyploid, as this condition provides a simple and effective means of generating the correct balance of mutant and wild-type alleles. Environmental effects can also lead to the dominance relationship between wild-type and mutant alleles varying between generative cells. In plants, this can lead to the apomixis gene being expressed, and thus to meiotic reduction being avoided, in only some ovules. Meiotically reduced, as well as meiotically unreduced, eggs are produced when this occurs. If compatible and viable pollen is available the meiotically reduced eggs may be fertilized, resulting in these organisms reproducing as facultative apomicts. It is argued that the control and evolution of parthenogenesis in generative apomicts varies between taxa. In some, the parthenogenetic initiation of embryos may result from the acquisition of a parthenogenesis gene or genes; but there is no reason to believe that this is either a general or a common requirement. Indeed, in some it may be an ancestral trait, these apomicts differing from their sexual ancestors in the ability to mature, rather than in the ability to initiate, embryos from unfertilized eggs; or it may result from physiological or developmental changes induced, for example, by polyploidization, hybridization, or the avoidance of meiotic reduction. In some plants it may be induced by pollination (without fertilization) or by the activity of a developing endosperm. Although it is argued that most generatively apomictic lineages may have acquired this form of reproduction relatively easily, by the acquisition of a mutation at a single locus, it is argued that newly initiated lineages may often be reproductively inefficient. These will begin to accumulate mutations that improve the efficiency of apomictic reproduction. 
Thus several loci may be involved in the control of generative apomixis in established lineages, even though only a single locus was involved in its initiation in these lineages. Care must be taken to distinguish between these initiator and modifier genes when considering the evolution of generative apomixis. Finally, it is argued that although generatively apomictic lineages have easily acquired this form of reproduction, its evolution in some taxa may be so difficult, requiring the acquisition of mutations simultaneously at two or more loci, that these may never acquire it. Thus, evidence obtained from taxa that have successfully made the transition from sexual reproduction to generative apomixis that its evolution was straightforward should not be used as evidence that its evolution will always be relatively easily achieved. Its uneven taxonomic distribution indicates that it is much more easily evolved by some taxonomic groups than by others.

This paper proposes a method for modeling longitudinal binary data when nonresponse depends on unobserved responses. The proposed method presumes that the target of inference is the marginal distribution of the response at each occasion and its dependence on covariates, and can accommodate both monotone and non-monotone missingness. The approach involves a marginally specified pattern-mixture model that directly parameterizes both the marginal means at each occasion and the dependence of each response on indicators of nonresponse pattern. This formulation readily incorporates a variety of nonresponse processes assumed within a sensitivity analysis. Once identifying restrictions have been made, estimation of model parameters proceeds via solution to a set of modified generalized estimating equations. The proposed method provides an alternative to standard selection and pattern-mixture modeling frameworks, while featuring certain advantages of each. The paper concludes with application of the method to data from a contraceptive clinical trial with substantial dropout.

This paper considers the use of a multivariate binomial probit model for the analysis of correlated exchangeable binary data. The model can naturally accommodate both cluster and individual level covariates, while keeping a fairly flexible intracluster association structure. We discuss Bayesian estimation when a sample of independent clusters of varying sizes is available, and show how Gibbs sampling may be used to derive the posterior densities of parameters. The methodology is illustrated with two examples: the first involves epidemiological data from a study of familial disease aggregation; the second uses teratological data from a developmental toxicity application.

CM Glaze  TW Troyer 《PLoS ONE》2012,7(7):e37616
Motor variability often reflects a mixture of different neural and peripheral sources operating over a range of timescales. We present a statistical model of sequence timing that can be used to measure three distinct components of timing variability: global tempo changes that are spread across the sequence, such as might stem from neuromodulatory sources with widespread influence; fast, uncorrelated timing noise, stemming from noisy components within the neural system; and timing jitter that does not alter the timing of subsequent elements, such as might be caused by variation in the motor periphery or by measurement error. In addition to quantifying the variability contributed by each of these latent factors in the data, the approach assigns maximum likelihood estimates of each factor on a trial-to-trial basis. We applied the model to adult zebra finch song, a temporally complex behavior with rich structure on multiple timescales. We find that individual song vocalizations (syllables) contain roughly equal amounts of variability in each of the three components while overall song length is dominated by global tempo changes. Across our sample of syllables, both global and independent variability scale with average length while timing jitter does not, a pattern consistent with the Wing and Kristofferson (1973) model of sequence timing. We also find significant day-to-day drift in all three timing sources, but a circadian pattern in tempo only. In tests using artificially generated data, the model successfully separates out the different components with small error. The approach provides a general framework for extracting distinct sources of timing variability within action sequences, and can be applied to neural and behavioral data from a wide array of systems.
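
The Wing and Kristofferson (1973) model cited in this abstract separates independent timing noise from jitter that does not propagate to later elements. The sketch below covers only that classic two-component decomposition (it omits the global tempo component of the paper's three-component model), and the simulated values are hypothetical.

```python
import random


def wing_kristofferson_decompose(intervals):
    """Split interval variance into timer and motor parts (two-component model).

    Assumes interval_n = timer_n + motor_n - motor_(n-1) with independent noise,
    so var = timer_var + 2*motor_var and lag-1 autocovariance = -motor_var.
    """
    n = len(intervals)
    mean = sum(intervals) / n
    var = sum((x - mean) ** 2 for x in intervals) / n
    lag1 = sum((intervals[i] - mean) * (intervals[i + 1] - mean)
               for i in range(n - 1)) / n
    motor_var = -lag1              # jitter that does not shift later elements
    timer_var = var + 2.0 * lag1   # uncorrelated timing noise
    return timer_var, motor_var


# Simulated check with hypothetical values: timer SD 10 ms, motor SD 5 ms.
rng = random.Random(1)
motor = [rng.gauss(0.0, 5.0) for _ in range(50001)]
intervals = [200.0 + rng.gauss(0.0, 10.0) + motor[i + 1] - motor[i]
             for i in range(50000)]
timer_var, motor_var = wing_kristofferson_decompose(intervals)
```

The negative lag-1 autocovariance is the signature of non-propagating jitter: a late element boundary lengthens one interval and shortens the next by the same amount.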

A biophysical model is proposed for how leaf primordia are positioned on the shoot apical meristem in both spiral and whorl phyllotaxes. Primordia are initiated by signals that propagate in the epidermis in both azimuthal directions away from the cotyledons or the most recently specified primordia. The signals are linear waves as inferred from the spatial periodicity of the divergence angle and a temporal periodicity. The periods of the waves, which represent actively transported auxin, are much smaller than the plastochron interval. Where oppositely directed waves meet at one or more angular positions on the periphery of the generative circle, auxin concentration builds and as in most models this stimulates local movement of auxin to underlying cells, where it promotes polarized cell division and expansion. For higher order spirals the wave model requires asymmetric function of auxin transport; that is, opposite wave speeds differ. An algorithm for determination of the angular positions of leaves in common leaf phyllotaxic configurations is proposed. The number of turns in a pattern repeat, number of leaves per level and per pattern repeat, and divergence angle are related to speed of auxin transport and radius of the generative circle. The rule for composition of Fibonacci or Lucas numbers associated with some phyllotaxes is discussed. A subcellular model suggests how the shoot meristem might specify either symmetric or asymmetric transport of auxin away from the forming primordia that produce it. Biological tests that could make or break the mathematical and molecular hypotheses are proposed.
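
As background for the link between divergence angles and Fibonacci numbers, the following sketch lists the classical divergence angles given by ratios of Fibonacci numbers two apart. It is not the wave-based algorithm proposed in the paper, whose details are not given in this abstract; the function name is hypothetical.

```python
def fibonacci_divergence_angles(count):
    """Divergence angles 360 * F(n) / F(n+2) for successive Fibonacci numbers."""
    a, b = 1, 1  # F(n), F(n+1)
    angles = []
    for _ in range(count):
        angles.append(360.0 * a / (a + b))  # a + b is F(n+2)
        a, b = b, a + b
    return angles


angles = fibonacci_divergence_angles(12)
# The first terms correspond to the fractions 1/2, 1/3, 2/5, 3/8,
# i.e. 180, 120, 144 and 135 degrees.
```

Later terms settle close to the golden angle of roughly 137.5 degrees, the divergence angle commonly reported for spiral phyllotaxis.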

Summary. Many longitudinal studies generate both the time to some event of interest and repeated measures data. This article is motivated by a study on patients with a renal allograft, in which interest lies in the association between longitudinal proteinuria (a dichotomous variable) measurements and the time to renal graft failure. An interesting feature of the sample at hand is that nearly half of the patients never tested positive for proteinuria (≥1 g/day) during follow-up, which introduces a degenerate part in the random-effects density for the longitudinal process. In this article we propose a two-part shared parameter model framework that effectively takes this feature into account, and we investigate sensitivity to the various dependence structures used to describe the association between the longitudinal measurements of proteinuria and the time to renal graft failure.

Regressive logistic models specify the probability distribution of familial binary traits by conditioning each individual's phenotype on those of preceding relatives; therefore, the expression of the joint probability of the familial data necessitates ordering the observations. In the present paper, we propose an autologistic model of this familial dependence structure, which does not require specification of a particular ordering of the phenotypic observations. Genetic effects are introduced into the model in order to perform segregation analysis that is aimed at detecting the role of a major locus in the expression of familial phenotypes. In this model, the conditional probabilities have a logistic form, and large patterns of dependence between relatives can be considered with a simple interpretation of the parameters measuring the relationship between two phenotypes. The model is compared with the regressive logistic approach in terms of odds ratios and by using a simulation study.
