Similar Documents

20 similar documents found.
1.
J M Neuhaus  N P Jewell 《Biometrics》1990,46(4):977-990
Recently a great deal of attention has been given to binary regression models for clustered or correlated observations. The data of interest are of the form of a binary dependent or response variable, together with independent variables X1, ..., Xk, where sets of observations are grouped together into clusters. A number of models and methods of analysis have been suggested to study such data. Many of these are extensions in some way of the familiar logistic regression model for binary data that are not grouped (i.e., each cluster is of size 1). In general, the analyses of these clustered data models proceed by assuming that the observed clusters are a simple random sample of clusters selected from a population of clusters. In this paper, we consider the application of these procedures to the case where the clusters are selected randomly in a manner that depends on the pattern of responses in the cluster. For example, we show that ignoring the retrospective nature of the sample design, by fitting standard logistic regression models for clustered binary data, may result in misleading estimates of the effects of covariates and the precision of estimated regression coefficients.
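To make the selection effect concrete, here is a minimal Python simulation (not the authors' estimator; the cluster size, response rate, and keep-clusters-with-an-event rule are illustrative assumptions) showing how response-dependent cluster sampling inflates the naive response rate:

```python
import random

def simulate_sampling_bias(n_clusters=20000, cluster_size=4, p=0.2, seed=0):
    """Compare the per-observation response rate in the full population of
    clusters with the rate in a retrospective sample that retains only
    clusters containing at least one positive response."""
    rng = random.Random(seed)
    clusters = [[1 if rng.random() < p else 0 for _ in range(cluster_size)]
                for _ in range(n_clusters)]
    all_obs = [y for c in clusters for y in c]
    population_rate = sum(all_obs) / len(all_obs)
    # Response-dependent cluster selection: keep only clusters with >= 1 event.
    sampled = [c for c in clusters if sum(c) >= 1]
    sampled_obs = [y for c in sampled for y in c]
    sampled_rate = sum(sampled_obs) / len(sampled_obs)
    return population_rate, sampled_rate
```

With p = 0.2 and clusters of size 4, the retained clusters show a per-observation rate of roughly p / (1 - (1-p)^4) ≈ 0.34, well above the population rate, which is the kind of distortion a naive analysis would mistake for a covariate effect.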

2.
Deschepper E  Thas O  Ottoy JP 《Biometrics》2008,64(3):912-920
Regression diagnostics and lack-of-fit tests mainly focus on linear regression models. When the design points are distributed on the circumference of a circle, difficulties arise as there is no natural starting point or origin. Most classical lack-of-fit tests require an arbitrarily chosen origin, but different choices may result in different conclusions. We propose a graphical diagnostic tool and a closely related lack-of-fit test, which does not require a natural starting point. The method is based on regional residuals, which are defined on arcs of the circle. The graphical method formally locates and visualizes subsets of poorly fitting observations on the circle. A data example from food technology is used to point out the aforementioned problems with conventional lack-of-fit tests and to illustrate the strength of the methodology based on regional residuals in detecting and localizing departures from the no-effect hypothesis. A small simulation study shows good performance of the regional residual test in the case of both global and local deviations from the null model. Finally, the ideas are extended to the case of more than one predictor variable.
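A hedged sketch of the regional-residual idea (the paper's exact test statistic and its normalization differ; this simply scans every arc of the circle for the largest absolute residual sum):

```python
def max_arc_statistic(angles, residuals):
    """Maximum absolute sum of residuals over all arcs of the circle.
    Points are sorted by angle; every contiguous run in the circular
    order (of any length short of the full circle) is a candidate arc,
    so no starting point needs to be chosen."""
    order = sorted(range(len(angles)), key=lambda i: angles[i])
    r = [residuals[i] for i in order]
    n = len(r)
    best = 0.0
    for start in range(n):
        s = 0.0
        for length in range(1, n):  # proper arcs, not the full circle
            s += r[(start + length - 1) % n]
            best = max(best, abs(s))
    return best
```

Because every rotation of the data visits the same set of arcs, the statistic is invariant to the choice of origin, which is the property the classical tests lack.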

3.
Mapping of quantitative trait loci based on growth models
An approach called growth model-based mapping (GMM) of quantitative trait loci (QTLs) is proposed in this paper. The principle of the approach is to fit the growth curve of each individual or line with a theoretical or empirical growth model at first and then map QTLs based on the estimated growth parameters with the method of multiple-trait composite interval mapping. In comparison with previously proposed approaches of QTL mapping based on growth data, GMM has several advantages: (1) it can greatly reduce the amount of phenotypic data for QTL analysis and thus alleviate the burden of computation, particularly when permutation tests or simulation are performed to estimate significance thresholds; (2) it can efficiently analyze unbalanced phenotype data because both balanced and unbalanced data can be used for fitting growth models; and (3) it may potentially help us to better understand the genetic basis of quantitative trait development because the parameters in a theoretical growth model may often have clear biological meanings. A practical example of rice leaf-age development is presented to demonstrate the utility of GMM.
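The curve-fitting step of GMM can be sketched as follows: an illustrative grid-search least-squares fit of a logistic growth curve, with the asymptote K assumed known (the paper's actual fitting procedure and the subsequent composite interval mapping step are not shown; the grids are hypothetical):

```python
import math

def logistic(t, K, r, t0):
    """Logistic growth curve with asymptote K, rate r, inflection time t0."""
    return K / (1.0 + math.exp(-r * (t - t0)))

def fit_logistic(times, values, K, r_grid, t0_grid):
    """Grid-search least squares for (r, t0); K is assumed known here.
    The fitted (r, t0) would then serve as traits for QTL mapping."""
    best = None
    for r in r_grid:
        for t0 in t0_grid:
            sse = sum((v - logistic(t, K, r, t0)) ** 2
                      for t, v in zip(times, values))
            if best is None or sse < best[0]:
                best = (sse, r, t0)
    return best[1], best[2]
```

With noise-free data generated at r = 0.5 and t0 = 5, and grids containing those values, the search recovers them exactly; real data would need a finer grid or a proper optimizer.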

4.
A proportional hazards model for interval-censored failure time data
D M Finkelstein 《Biometrics》1986,42(4):845-854
This paper develops a method for fitting the proportional hazards regression model when the data contain left-, right-, or interval-censored observations. Results given for testing the hypothesis of a zero regression coefficient lead to a generalization of the log-rank test for comparison of several survival curves. The method is used to analyze data from an animal tumorigenicity study and also a clinical trial.
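A minimal sketch of the interval-censored likelihood under proportional hazards, assuming (for illustration only; the paper estimates the baseline nonparametrically) an exponential baseline survival function:

```python
import math

def surv(t, lam, beta, x):
    """Survival function under proportional hazards with an assumed
    exponential baseline S0(t) = exp(-lam * t), covariate effect beta."""
    return math.exp(-lam * t) ** math.exp(beta * x)

def interval_loglik(data, lam, beta):
    """data: list of (L, R, x), the event known to fall in (L, R].
    R = None encodes right censoring; L = 0 encodes left censoring.
    Each observation contributes log(S(L) - S(R))."""
    ll = 0.0
    for L, R, x in data:
        sL = surv(L, lam, beta, x)
        sR = surv(R, lam, beta, x) if R is not None else 0.0
        ll += math.log(sL - sR)
    return ll
```

Exact event times are never needed: left-, right-, and interval-censored observations all enter through the same S(L) - S(R) term, which is what makes the generalization of the log-rank test possible.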

5.
The aim of this study was to explore, by computer simulation, the mapping of QTLs in a realistic but complex situation of many (linked) QTLs with different effects, and to compare two QTL mapping methods. A novel method to dissect genetic variation on multiple chromosomes using molecular markers in backcross and F2 populations derived from inbred lines was suggested, and its properties tested using simulations. The rationale for this sequential testing method was to explicitly test for alternative genetic models. The method consists of a series of four basic statistical tests to decide whether variance was due to a single QTL, two QTLs, multiple QTLs, or polygenes, starting with a test to detect genetic variance associated with a particular chromosome. The method was able to distinguish between different QTL configurations, in that the probability of detecting the correct model was high, varying from 0.75 to 1. For example, for a backcross population of 200 and an overall heritability of 50%, in 78% of replicates a polygenic model was detected when that was the underlying true model. To test the method for multiple chromosomes, QTLs were simulated on 10 chromosomes, following a geometric series of allele effects, assuming positive alleles were in coupling in the founder lines. For these simulations, the sequential testing method was compared to the established Multiple QTL Mapping (MQM) method. For a backcross population of 400 individuals, power to detect genetic variance was low with both methods when the heritability was 0.40. For example, the power to detect genetic variation on a chromosome on which 6 QTLs explained 12.6% of the genetic variance was less than 60% for both methods. For a large heritability (0.90), the power of MQM to detect genetic variance and to dissect QTL configurations was generally better, due to the simultaneous fitting of markers on all chromosomes. It is concluded that when testing different QTL configurations on a single chromosome using the sequential testing procedure, regions of other chromosomes which explain a significant amount of variation should be fitted in the model of analysis. This study reinforces the need for large experiments in plants and other species if the aim of a genome scan is to dissect quantitative genetic variation.

6.
A General Monte Carlo Method for Mapping Multiple Quantitative Trait Loci
R. C. Jansen 《Genetics》1996,142(1):305-311
In this paper we address the mapping of multiple quantitative trait loci (QTLs) in line crosses for which the genetic data are highly incomplete. Such complicated situations occur, for instance, when dominant markers are used or when unequally informative markers are used in experiments with outbred populations. We describe a general and flexible Monte Carlo expectation-maximization (Monte Carlo EM) algorithm for fitting multiple-QTL models to such data. Implementation of this algorithm is straightforward in standard statistical software, but computation may take much time. The method may be generalized to cope with more complex models for animal and human pedigrees. A practical example is presented, where a three-QTL model is adopted in an outbreeding situation with dominant markers. The example is concerned with the linkage between randomly amplified polymorphic DNA (RAPD) markers and QTLs for partial resistance to Fusarium oxysporum in lily.
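The flavor of Monte Carlo EM can be shown on a deliberately simple stand-in problem, a two-component normal mixture with known spread, where the E-step expectation is replaced by sampled component labels (the paper's multiple-QTL models are far richer; all constants here are illustrative):

```python
import math, random

def mcem_two_means(data, sd=0.5, iters=30, seed=1):
    """Monte Carlo EM for a two-component normal mixture with known,
    equal standard deviations: the E-step samples each component label
    from its posterior (the Monte Carlo step), the M-step updates the
    component means from the imputed groups."""
    rng = random.Random(seed)
    mu = [-1.0, 1.0]  # deliberately poor starting values
    for _ in range(iters):
        groups = ([], [])
        for x in data:
            w0 = math.exp(-0.5 * ((x - mu[0]) / sd) ** 2)
            w1 = math.exp(-0.5 * ((x - mu[1]) / sd) ** 2)
            z = 1 if rng.random() < w1 / (w0 + w1) else 0  # Monte Carlo E-step
            groups[z].append(x)
        mu = [sum(g) / len(g) if g else m for g, m in zip(groups, mu)]
    return mu
```

In the QTL setting the sampled labels would be the missing genotypes given marker data, and the M-step would refit the QTL effects; the structure of the iteration is the same.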

7.
Ma CX  Casella G  Wu R 《Genetics》2002,161(4):1751-1762
Unlike a character measured at a finite set of landmark points, function-valued traits are those that change as a function of some independent and continuous variable. These traits, also called infinite-dimensional characters, can be described as the character process and include a number of biologically, economically, or biomedically important features, such as growth trajectories, allometric scalings, and norms of reaction. Here we present a new statistical infrastructure for mapping quantitative trait loci (QTL) underlying the character process. This strategy, termed functional mapping, integrates mathematical relationships of different traits or variables within the genetic mapping framework. Logistic mapping proposed in this article can be viewed as an example of functional mapping. Logistic mapping is based on a universal biological law that for each and every living organism growth over time follows a sigmoidal (e.g., logistic or S-shaped) growth curve. A maximum-likelihood approach based on a logistic-mixture model, implemented with the EM algorithm, is developed to provide the estimates of QTL positions, QTL effects, and other model parameters responsible for growth trajectories. Logistic mapping displays a tremendous potential to increase the power of QTL detection, the precision of parameter estimation, and the resolution of QTL localization due to the small number of parameters to be estimated, the pleiotropic effect of a QTL on growth, and/or residual correlations of growth at different ages. More importantly, logistic mapping allows for testing numerous biologically important hypotheses concerning the genetic basis of quantitative variation, thus gaining an insight into the critical role of development in shaping plant and animal evolution and domestication. The power of logistic mapping is demonstrated by an example of a forest tree, in which one QTL affecting stem growth processes is detected on a linkage group using our method, whereas it cannot be detected using current methods. The advantages of functional mapping are also discussed.

8.
The boundary line model was proposed to interpret biological data sets, where one variable is a biological response (e.g. crop yield) to an independent variable (e.g. available water content of the soil). The upper (or lower) boundary on a plot of the dependent variable (ordinate) against the independent variable (abscissa) represents the limiting response of the dependent variable to the independent variable value. Although the concept has been widely used, the methods proposed to define the boundary line have been subject to criticism. This is because of their ad hoc nature and lack of theoretical basis. In this article, we present a novel method for fitting the boundary line to a set of data. The method uses a censored probability distribution to interpret the data structure. The parameters of the distribution (and hence the boundary line parameters) are fitted using maximum likelihood and related confidence intervals deduced. The method is demonstrated using both simulated and real data sets.

9.
D Katz  D Z D'Argenio 《Biometrics》1983,39(3):621-628
Many experimental situations, including bioavailability studies, require the estimation of integrals by numerical quadrature applied to dependent variable observations with measurement error. A strategy is described for selecting values for the independent variable (e.g. time). The strategy minimizes the expectation of the square of the difference between the exact integral and the quadrature approximation. This approach was applied to simulated pharmacokinetic problems, including the estimation of bioavailability. Results indicate that the procedure is potentially useful in reducing the variance of resulting estimates and that it appears to be robust with respect to prior assumptions about model parameter values.
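A hedged illustration of one ingredient of this criterion: the variance contribution of measurement error to a trapezoidal-rule estimate depends only on the quadrature weights at the chosen time points (the paper's criterion also involves a bias term and model-based optimization over the design, which are not shown):

```python
def trapezoid_weights(times):
    """Quadrature weights of the composite trapezoidal rule on an
    arbitrary sorted time grid."""
    w = [0.0] * len(times)
    for i in range(len(times) - 1):
        h = times[i + 1] - times[i]
        w[i] += h / 2.0
        w[i + 1] += h / 2.0
    return w

def estimator_variance(times, sigma):
    """Variance of the quadrature estimate when each observation carries
    independent measurement error with standard deviation sigma:
    Var(sum w_i * y_i) = sigma^2 * sum w_i^2."""
    return sigma ** 2 * sum(wi ** 2 for wi in trapezoid_weights(times))
```

Comparing this quantity across candidate sampling schedules shows why the choice of time points matters even before any bias in the quadrature rule is considered.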

10.
Examples are given of the detection of diagnostic clues in quantitative cytology and histopathology by statistical testing, such as may be applied in image analytical procedures. Schematic and other examples are presented of the visual images analyzed by each procedure, whose limits are also discussed. The situations analyzed include increased cellularity, differences in nuclear placement patterns, uniformity of displacement, variance in nuclear diameters and chaincode variance of nuclear placement. A specific model is presented for describing or generating a series of dependent observations representing nuclear placement, based on the Box-Jenkins (ARIMA) models for decomposing a spatial or temporal series into several components. This model describes the statistical observations that are random samples from the series. Finally, one graphic example is given in which visual inspection more readily ascertains an alteration than does statistical analysis of a modest-sized sample.  相似文献   

11.
In the context of a bioassay or an immunoassay, calibration means fitting a curve, usually nonlinear, through the observations collected on a set of samples containing known concentrations of a target substance, and then using the fitted curve and observations collected on samples of interest to predict the concentrations of the target substance in these samples. Recent technological advances have greatly improved our ability to quantify minute amounts of substance from a tiny volume of biological sample. This has in turn led to a need to improve statistical methods for calibration. In this article, we focus on developing calibration methods robust to dependent outliers. We introduce a novel normal mixture model with dependent error terms to model the experimental noise. In addition, we propose a reparameterization of the five parameter logistic nonlinear regression model that allows us to better incorporate prior information. We examine the performance of our methods with simulation studies and show that they lead to a substantial increase in performance measured in terms of mean squared error of estimation and a measure of the average prediction accuracy. A real data example from the HIV Vaccine Trials Network Laboratory is used to illustrate the methods.
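For concreteness, a sketch of the five-parameter logistic (5PL) curve and its analytic inverse, the step that maps an observed response back to a predicted concentration (the parameter values in the test are arbitrary; the paper's reparameterization and robust mixture error model are not reproduced):

```python
def fpl(x, a, d, c, b, g):
    """Five-parameter logistic: a = response at zero concentration,
    d = response at infinite concentration, c = mid-range concentration,
    b = slope factor, g = asymmetry factor."""
    return d + (a - d) / (1.0 + (x / c) ** b) ** g

def fpl_inverse(y, a, d, c, b, g):
    """Predict concentration from an observed response, i.e. the
    calibration step: solve y = fpl(x, ...) for x."""
    return c * (((a - d) / (y - d)) ** (1.0 / g) - 1.0) ** (1.0 / b)
```

The round trip fpl_inverse(fpl(x)) recovers x for responses strictly between the two asymptotes; outside that range the inverse is undefined, which is one reason calibration near the asymptotes is unreliable.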

13.
Experimental error control is very important in quantitative trait locus (QTL) mapping. Although numerous statistical methods have been developed for QTL mapping, a QTL detection model based on an appropriate experimental design that emphasizes error control has not been developed. Lattice design is very suitable for experiments with large sample sizes, which is usually required for accurate mapping of quantitative traits. However, the lack of a QTL mapping method based on lattice design has meant that the arithmetic mean or adjusted mean of each line of observations in the lattice design had to be used as a response variable, resulting in low QTL detection power. As an improvement, we developed a QTL mapping method termed composite interval mapping based on lattice design (CIMLD). In the lattice design, experimental errors are decomposed into random errors and block-within-replication errors. Four levels of block-within-replication errors were simulated to show the power of QTL detection under different error controls. The simulation results showed that the arithmetic mean method, which is equivalent to a method under randomized complete block design (RCBD), was very sensitive to the size of the block variance: as the block variance increased, the power of QTL detection decreased from 51.3% to 9.4%. In contrast to the RCBD method, the power of CIMLD and the adjusted mean method did not change for different block variances. The CIMLD method showed 1.2- to 7.6-fold higher power of QTL detection than the arithmetic or adjusted mean methods. Our proposed method was applied to real soybean (Glycine max) data as an example, and 10 QTLs for biomass were identified that explained 65.87% of the phenotypic variation, while only three and two QTLs were identified by the arithmetic and adjusted mean methods, respectively.

14.
Precision Mapping of Quantitative Trait Loci
Z. B. Zeng 《Genetics》1994,136(4):1457-1468
Adequate separation of effects of possible multiple linked quantitative trait loci (QTLs) on mapping QTLs is the key to increasing the precision of QTL mapping. A new method of QTL mapping is proposed and analyzed in this paper by combining interval mapping with multiple regression. The basis of the proposed method is an interval test in which the test statistic on a marker interval is made to be unaffected by QTLs located outside a defined interval. This is achieved by fitting other genetic markers in the statistical model as a control when performing interval mapping. Compared with the current QTL mapping method (i.e., the interval mapping method which uses a pair or two pairs of markers for mapping QTLs), this method has several advantages. (1) By confining the test to one region at a time, it reduces a multiple dimensional search problem (for multiple QTLs) to a one dimensional search problem. (2) By conditioning linked markers in the test, the sensitivity of the test statistic to the position of individual QTLs is increased, and the precision of QTL mapping can be improved. (3) By selectively and simultaneously using other markers in the analysis, the efficiency of QTL mapping can be also improved. The behavior of the test statistic under the null hypothesis and appropriate critical value of the test statistic for an overall test in a genome are discussed and analyzed. A simulation study of QTL mapping is also presented which illustrates the utility, properties, advantages and disadvantages of the method.

15.
A number of important data analysis problems in neuroscience can be solved using state-space models. In this article, we describe fast methods for computing the exact maximum a posteriori (MAP) path of the hidden state variable in these models, given spike train observations. If the state transition density is log-concave and the observation model satisfies certain standard assumptions, then the optimization problem is strictly concave and can be solved rapidly with Newton–Raphson methods, because the Hessian of the loglikelihood is block tridiagonal. We can further exploit this block-tridiagonal structure to develop efficient parameter estimation methods for these models. We describe applications of this approach to neural decoding problems, with a focus on the classic integrate-and-fire model as a key example.
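The key computational point, that a (block-)tridiagonal Hessian makes each Newton step linear-time in the path length, can be illustrated with the scalar Thomas algorithm (the paper works with block matrices; this is the scalar analogue):

```python
def thomas_solve(sub, diag, sup, rhs):
    """Solve a tridiagonal linear system in O(n) by forward elimination
    and back substitution: sub/sup are the sub- and super-diagonals
    (length n-1), diag the main diagonal (length n). In the MAP-path
    setting each Newton step solves such a system instead of inverting
    a dense n x n Hessian."""
    n = len(diag)
    c = list(sup) + [0.0]
    d = list(diag)
    b = list(rhs)
    for i in range(1, n):          # forward elimination
        m = sub[i - 1] / d[i - 1]
        d[i] -= m * c[i - 1]
        b[i] -= m * b[i - 1]
    x = [0.0] * n
    x[-1] = b[-1] / d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = (b[i] - c[i] * x[i + 1]) / d[i]
    return x
```

For T time bins the dense solve would cost O(T^3); the banded structure brings each Newton iteration down to O(T), which is what makes exact MAP decoding of long spike trains practical.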

17.
Although RANSAC is proven to be robust, the original RANSAC algorithm selects hypothesis sets at random, generating numerous iterations and high computational costs because many hypothesis sets are contaminated with outliers. This paper presents a conditional sampling method, multiBaySAC (Bayes SAmple Consensus), that fuses the BaySAC algorithm with statistical testing of candidate model parameters for unorganized 3D point clouds to fit multiple primitives. This paper first presents a statistical testing algorithm for a candidate model parameter histogram to detect potential primitives. As the detected initial primitives were optimized using a parallel strategy rather than a sequential one, every data point in the multiBaySAC algorithm was assigned multiple prior inlier probabilities for the initial multiple primitives. Each prior inlier probability determined the probability that a point belongs to the corresponding primitive. We then implemented in parallel a conditional sampling method: BaySAC. With each iteration of the hypothesis testing process, hypothesis sets with the highest inlier probabilities were selected and verified for the existence of multiple primitives, yielding the fit for multiple primitives. Moreover, the initial probability was updated based on a memorable form of Bayes' Theorem, which describes the relationship between the prior and posterior probabilities of a data point by determining whether the hypothesis set to which a data point belongs is correct. The proposed approach was tested using real and synthetic point clouds. The results show that the proposed multiBaySAC algorithm can achieve high computational efficiency (averaging 34% higher than the efficiency of the sequential RANSAC method) and fitting accuracy (exhibiting good performance in the intersection of two primitives), whereas the sequential RANSAC framework clearly suffers from over- and under-segmentation problems. Future work will aim at further optimizing this strategy through its application to other problems such as multiple point cloud co-registration and multiple image matching.
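For reference, the baseline scheme that BaySAC improves on can be sketched as plain RANSAC for a single 2D line (the iteration count, inlier threshold, and toy data below are illustrative assumptions):

```python
import random

def ransac_line(points, iters=100, threshold=0.1, seed=0):
    """Plain RANSAC for a single 2D line y = m*x + k: sample two points
    uniformly, hypothesize a line, count inliers within the threshold,
    and keep the hypothesis with the most inliers. BaySAC replaces the
    uniform sampling with inlier-probability-ordered hypothesis sets;
    only the baseline scheme is shown here."""
    rng = random.Random(seed)
    best = (0, None)
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical hypothesis; skip in this sketch
        m = (y2 - y1) / (x2 - x1)
        k = y1 - m * x1
        inliers = sum(1 for x, y in points
                      if abs(y - (m * x + k)) < threshold)
        if inliers > best[0]:
            best = (inliers, (m, k))
    return best[1]
```

Because hypothesis sets are drawn uniformly, many iterations are wasted on outlier-contaminated pairs, which is exactly the cost that probability-guided sampling reduces.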

18.
Isothermal titration calorimetry (ITC) produces a differential heat signal with respect to the total titrant concentration. This feature gives ITC excellent sensitivity for studying the thermodynamics of complex biomolecular interactions in solution. Currently, numerical methods for data fitting are based primarily on indirect approaches rooted in the usual practice of formulating biochemical models in terms of integrated variables. Here, a direct approach is presented wherein ITC models are formulated and solved as numerical initial value problems for data fitting and simulation purposes. To do so, the ITC signal is cast explicitly as a first-order ordinary differential equation (ODE) with total titrant concentration as independent variable and the concentration of a bound or free ligand species as dependent variable. This approach was applied to four ligand-receptor binding and homotropic dissociation models. Qualitative analysis of the explicit ODEs offers insights into the behavior of the models that would be inaccessible to indirect methods of analysis. Numerical ODEs are also highly compatible with regression analysis. Since solutions to numerical initial value problems are straightforward to implement on common computing platforms in the biochemical laboratory, this method is expected to facilitate the development of ITC models tailored to any experimental system of interest.
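A sketch of the numerical core of the direct approach: integrating a first-order ODE dy/dx = f(x, y) as an initial value problem with classical fourth-order Runge-Kutta (the binding-model right-hand sides from the paper are not reproduced; the test checks the integrator on dy/dx = -y, whose exact solution is known):

```python
def rk4(f, x0, y0, x_end, n_steps=200):
    """Classical fourth-order Runge-Kutta integration of dy/dx = f(x, y)
    from (x0, y0) to x_end. In the ITC setting, x would be the total
    titrant concentration and y a bound or free species concentration."""
    h = (x_end - x0) / n_steps
    x, y = x0, y0
    for _ in range(n_steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
        x += h
    return y
```

Any right-hand side f derived from a binding model can be dropped in unchanged, which is what makes the initial-value formulation convenient for regression: the same integrator serves every candidate model.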

19.
A discrete time state vector model (the Hahn model) has been used to simulate many experiments in cell kinetics. In the first paper in this series the authors described a new method to define the parameters of the Hahn model suitable for use in automatic fitting of fraction of labelled mitoses (FLM) experiments. In this paper it is shown how to compute the first three moments of the transit time distribution which arises from a Hahn model. These moments are compared analytically and numerically to the corresponding moments of the distribution the authors used to define the Hahn model. Finally, the problems involved in estimating the moments of the transit time distribution observed in fitting FLM data using a Hahn model are discussed.

20.
Diagnostic plots in Cox's regression model.
C H Chen  P C Wang 《Biometrics》1991,47(3):841-850
Two diagnostic plots are presented for validating the fitting of a Cox proportional hazards model. The added variable plot is developed to assess the effect of adding a covariate to the model. The constructed variable plot is applied to detect nonlinearity of a fitted covariate. Both plots are also useful for identifying influential observations on the issues of interest. The methods are illustrated on examples of multiple myeloma and lung cancer data.
