Similar literature
20 similar documents found
1.
If a dependent variable in a regression analysis is exceptionally expensive or hard to obtain, the overall sample size used to fit the model may be limited. To avoid this, one may use a cheaper or more easily collected “surrogate” variable to supplement the expensive variable. The regression analysis will be enhanced to the degree that the surrogate is associated with the costly dependent variable. We develop a Bayesian approach incorporating surrogate variables in regression based on a two‐stage experiment. Illustrative examples are given, along with comparisons to an existing frequentist method. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

2.
3.
4.
A candidate's formula: A curious result in Bayesian prediction   Total citations: 2 (self-citations: 0, citations by others: 2)
Besag, Julian. Biometrika, 1989, 76(1): 183

5.
Bayesian curve fitting using multivariate normal mixtures   Total citations: 1 (self-citations: 0, citations by others: 1)

6.
7.
Testimation is considered in the problem of estimating regression parameters. The first-stage sample is used to test a (null) hypothesis that specifies initial (preassumed) values for some of the regression parameters. If the data agree with the hypothesis, a linear combination of the preassumed values and the ordinary least squares (OLS) estimates is taken as the estimate. Otherwise, a second sample is taken and the parameters are estimated by OLS alone, based on the combined sample. The procedure protects against type II error and against taking larger samples when inference can be made from a smaller sample.

8.
Daniel Gianola. Genetics, 2013, 194(3): 573-596
Whole-genome enabled prediction of complex traits has received enormous attention in animal and plant breeding and is making inroads into human and even Drosophila genetics. The term “Bayesian alphabet” denotes a growing number of letters of the alphabet used to denote various Bayesian linear regressions that differ in the priors adopted, while sharing the same sampling model. We explore the role of the prior distribution in whole-genome regression models for dissecting complex traits in what is now a standard situation with genomic data where the number of unknown parameters (p) typically exceeds sample size (n). Members of the alphabet aim to confront this overparameterization in various manners, but it is shown here that the prior is always influential, unless n ≫ p. This happens because parameters are not likelihood identified, so Bayesian learning is imperfect. Since inferences are not devoid of the influence of the prior, claims about genetic architecture from these methods should be taken with caution. However, all such procedures may deliver reasonable predictions of complex traits, provided that some parameters (“tuning knobs”) are assessed via a properly conducted cross-validation. It is concluded that members of the alphabet have a place in whole-genome prediction of phenotypes, but have somewhat doubtful inferential value, at least when sample size is such that n ≪ p.
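
A toy calculation can illustrate why the prior stays influential when p exceeds n. The sketch below is not from the article; it assumes the smallest possible case (n = 1 observation, p = 2 coefficients) with a Gaussian prior N(m, τ²I) and Gaussian errors, for which the posterior mean is m + x (y − x·m) / (x·x + λ) with λ = σ²/τ². The function name and the numbers are hypothetical.

```python
def posterior_mean(x, y, prior_mean, lam):
    """Posterior mean of the coefficients given ONE observation (x, y).

    Assumes a Gaussian prior N(prior_mean, tau^2 I) and Gaussian noise with
    variance sigma^2, where lam = sigma^2 / tau^2 (illustrative setup only).
    """
    resid = y - sum(xi * mi for xi, mi in zip(x, prior_mean))
    scale = resid / (sum(xi * xi for xi in x) + lam)
    return [mi + xi * scale for xi, mi in zip(x, prior_mean)]

# One observation, two coefficients: only b1 + b2 is identified by the data.
x, y = [1.0, 1.0], 2.0
post_a = posterior_mean(x, y, prior_mean=[0.0, 0.0], lam=1e-6)
post_b = posterior_mean(x, y, prior_mean=[2.0, -2.0], lam=1e-6)

# Both posteriors reproduce the observation almost exactly...
fit_a = sum(xi * bi for xi, bi in zip(x, post_a))
fit_b = sum(xi * bi for xi, bi in zip(x, post_b))
# ...yet the individual coefficients differ by about 2, driven by the prior alone.
gap = abs(post_a[0] - post_b[0])
```

Even with a nearly flat prior (very small `lam`), the two analyses agree on the fitted value but disagree on the individual coefficients, which mirrors the abstract's distinction between prediction and inference.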

9.
Microarray techniques using cDNA array and comparative genomic hybridization (CGH) have been developed for several discovery applications. In recent years they have been frequently applied to the prediction and diagnosis of cancer. Many studies have shown that integrating genomic data from different sources may increase the reliability of gene expression analysis results in understanding cancer progression. Therefore, developing a good prognostic model that deals simultaneously with different types of dataset is important. The challenge with these types of data is high background noise. We describe an analytical two-stage framework with a multi-parallel data analysis method named wavelet-based generalized singular value decomposition and shaving method (WGSVD-shaving). This method is proposed for de-noising and dimension reduction during early-stage prognosis modeling. We also applied a supervised gene clustering technique with penalized logistic regression with Cox-model on an integrated dataset. We show the accuracy of the method using a simulated dataset with a case study on Hepatocellular Carcinoma (HCC) cDNA and CGH data. The method shows improved results over GSVD-shaving and has application in the discovery of candidate genes associated with cancer.

10.
11.
The N-linked glycan in immunoglobulin G is critical for the stability and function of the crystallizable fragment (Fc) region. Alteration of these protein properties upon the removal of the N-linked glycan has often been explained by the alteration of the CH2 domain orientation in the Fc region. To confirm this hypothesis, we examined the small-angle X-ray scattering (SAXS) profile of the glycosylated Fc region (gFc) and aglycosylated Fc region (aFc) in solution. Conformational characteristics of the CH2 domain orientation were validated by comparison with SAXS profiles theoretically calculated from multiple crystal structures of the Fc region with different CH2 domain orientations. The reduced chi-square values from the fitting analyses of gFc and aFc were associated with the degree of openness or closure of each crystal structure, as determined from the first principal component that partially governed the variation of the CH2 domain orientation extracted by a singular value decomposition analysis. For both gFc and aFc, the best-fitted SAXS profiles corresponded to ones calculated based on the crystal structure of gFc that formed a “semi-closed” CH2 domain orientation. Collectively, the data indicated that the removal of the N-linked glycan only negligibly affected the CH2 domain orientation in solution. These findings will guide the development of methodology for the production of highly refined functional Fc variants.

12.
13.
Forensic age estimation has received growing attention from researchers in the last few years. Accurate estimates of age are needed both for identifying real age in individuals without any identity document and for assessing it in human remains. The methods applied in this context are mostly based on radiological analysis of some anatomical regions and entail the use of a regression model. However, estimating chronological age by regression models leads to overestimated ages in younger subjects and underestimated ages in older ones. We introduced a full Bayesian calibration method combined with a segmented function for age estimation that relied on a Normal distribution as a density model to mitigate this bias. In this way, we were also able to model the decreasing growth rate in juveniles. We compared our new Bayesian‐segmented model with other existing approaches. The proposed method helped produce more robust and precise forecasts of age than the compared models while exhibiting comparable accuracy in terms of forecasting measures. Our method also seemed to overcome the estimation bias when applied to a real data set of South‐African juvenile subjects.

14.
It is assumed that a known, correct, linear regression model (model I) is given. Let the problem be based on a Bayesian estimation of the regression parameter so that any available a priori information regarding this parameter can be used. This Bayesian estimation is, under squared loss, an optimal strategy for the overall problem, which is divided into an estimation and a design problem. For practical reasons, the effort involved in performing the experiment will be taken into account as costs. In other words, the experimental design must result in the greatest possible accuracy for a given total cost (restriction of the sample size n). The linear cost function k(x) = 1 + c (x - a)/(b - a) is used to construct cost-optimal experimental designs for simple linear regression by means of V = H = [a, b] in a way similar to that used for classical optimality criteria. The complicated structures of these designs and the difficulty in determining them by a direct approach have made it appear advisable to describe an iterative procedure for the construction of cost-optimal designs.
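
To make the cost function concrete, the following sketch evaluates k(x) = 1 + c (x - a)/(b - a) and the total cost of a design. The representation of a design as a mapping from design points to observation counts, and the numerical values, are assumptions chosen for illustration; the paper's iterative construction of cost-optimal designs is not reproduced here.

```python
def unit_cost(x, a, b, c):
    """Cost of one observation at design point x in [a, b]: k(x) = 1 + c*(x - a)/(b - a)."""
    if not a <= x <= b:
        raise ValueError("x must lie in the design interval [a, b]")
    return 1.0 + c * (x - a) / (b - a)

def total_cost(design, a, b, c):
    """Total cost of a design given as {design point: number of observations}."""
    return sum(n * unit_cost(x, a, b, c) for x, n in design.items())

# Hypothetical example: interval [0, 1], c = 2, five observations at each endpoint.
design = {0.0: 5, 1.0: 5}
cost = total_cost(design, a=0.0, b=1.0, c=2.0)  # 5 * 1 + 5 * 3 = 20
```

Because observations at x = b cost 1 + c while those at x = a cost 1, a fixed budget buys fewer observations as the design shifts toward b, which is the trade-off a cost-optimal design has to balance against accuracy.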

15.
Bayesian inference for prevalence in longitudinal two-phase studies   Total citations: 1 (self-citations: 0, citations by others: 1)
Erkanli A, Soyer R, Costello EJ. Biometrics, 1999, 55(4): 1145-1150
We consider Bayesian inference and model selection for prevalence estimation using a longitudinal two-phase design in which subjects initially receive a low-cost screening test followed by an expensive diagnostic test conducted on several occasions. The change in the subject's diagnostic probability over time is described using four mixed-effects probit models in which the subject-specific effects are captured by latent variables. The computations are performed using Markov chain Monte Carlo methods. These models are then compared using the deviance information criterion. The methodology is illustrated with an analysis of alcohol and drug use in adolescents using data from the Great Smoky Mountains Study.

16.
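Influence analysis of covariance matrix perturbation in multiple linear regression models   Total citations: 2 (self-citations: 0, citations by others: 2)
This paper discusses influence analysis for perturbations of the covariance matrix in the multiple linear regression model and obtains several relations between B̂ and B̂(G), where B̂ is the best linear unbiased estimator (BLUE) of the parameter matrix B in the original model and B̂(G) is the BLUE of B in the model after the covariance matrix is perturbed. The paper gives a measure DG of the size of the influence, together with several of its forms. A final worked example illustrates the effectiveness of DG in influence analysis.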

17.
Conditions for superiority of the minimum dispersion estimator over another with respect to the covariance matrix are derived when the vector parameter of a regression model is subject to competing stochastic restrictions. The restrictions may also consist both of a deterministic part and a stochastic part.

18.
Bivariate mixed effects models are often used to jointly infer upon covariance matrices for both random effects (u) and residuals (e) between two different phenotypes in order to investigate the architecture of their relationship. However, these (co)variances themselves may additionally depend upon covariates as well as additional sets of exchangeable random effects that facilitate borrowing of strength across a large number of clusters. We propose a hierarchical Bayesian extension of the classical bivariate mixed effects model by embedding additional levels of mixed effects modeling of reparameterizations of u‐level and e‐level (co)variances between two traits. These parameters are based upon a recently popularized square‐root‐free Cholesky decomposition and are readily interpretable, each conveniently facilitating a generalized linear model characterization. Using Markov Chain Monte Carlo methods, we validate our model based on a simulation study and apply it to a joint analysis of milk yield and calving interval phenotypes in Michigan dairy cows. This analysis indicates that the e‐level relationship between the two traits is highly heterogeneous across herds and depends upon systematic herd management factors.

19.
Numerous Bayesian methods of phenotype prediction and genomic breeding value estimation based on multilocus association models have been proposed. Computationally the methods have been based either on Markov chain Monte Carlo or on faster maximum a posteriori estimation. The demand for more accurate and more efficient estimation has led to the rapid emergence of workable methods, unfortunately at the expense of well-defined principles for Bayesian model building. In this article we go back to the basics and build a Bayesian multilocus association model for quantitative and binary traits with carefully defined hierarchical parameterization of Student's t and Laplace priors. In this treatment we consider alternative model structures, using indicator variables and polygenic terms. We make the most of the conjugate analysis, enabled by the hierarchical formulation of the prior densities, by deriving the fully conditional posterior densities of the parameters and using the acquired known distributions in building fast generalized expectation-maximization estimation algorithms.

20.
Background: Previous studies have explored population-level smoking trends and the incidence of lung cancer, but none has jointly modeled them. This study modeled the relationship between smoking rate and incidence of lung cancer, by gender, in the U.S. adult population and estimated the lag time between changes in smoking trend and changes in incidence trends. Methods: The annual total numbers of smokers, by gender, were obtained from the database of the National Health Interview Survey (NHIS) program of the Centers for Disease Control and Prevention (CDC) for the years 1976 through 2018. The population-level incidence data for lung and bronchus cancers, by gender and five-year age group, were obtained for the same years from the Surveillance, Epidemiology, and End Results (SEER) program database of the National Cancer Institute. A Bayesian joinpoint statistical model, assuming Poisson errors, was developed to explore the relationship between smoking and lung cancer incidence in the time trend. Results: The model estimates and predicts the rate of change of incidence in the time trend, adjusting for expected smoking rate in the population, age, and gender. It shows that smoking trend is a strong predictor of incidence trend and predicts that rates will be roughly equal for males and females in the year 2023, after which the incidence rate for females will exceed that of males. In addition, the model estimates the lag time between smoking and incidence to be 8.079 years. Conclusions: Because there is a three-year delay in reporting smoking-related data and a four-year delay for incidence data, this model provides valuable predictions of smoking rate and associated lung cancer incidence before the data are available. By recognizing differing trends by gender, the model will inform gender-specific aspects of public health policy related to tobacco use and its impact on lung cancer incidence.
