Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Daye ZJ  Chen J  Li H 《Biometrics》2012,68(1):316-326
We consider the problem of high-dimensional regression under non-constant error variances. Despite being a common phenomenon in biological applications, heteroscedasticity has, so far, been largely ignored in high-dimensional analysis of genomic data sets. We propose a new methodology that allows non-constant error variances for high-dimensional estimation and model selection. Our method incorporates heteroscedasticity by simultaneously modeling both the mean and variance components via a novel doubly regularized approach. Extensive Monte Carlo simulations indicate that our proposed procedure can result in better estimation and variable selection than existing methods when heteroscedasticity arises from the presence of predictors explaining error variances and outliers. Further, we demonstrate the presence of heteroscedasticity in, and apply our method to, an expression quantitative trait loci (eQTLs) study of 112 yeast segregants. The new procedure can automatically account for heteroscedasticity in identifying the eQTLs that are associated with gene expression variations and lead to smaller prediction errors. These results demonstrate the importance of considering heteroscedasticity in eQTL data analysis.

2.
We present a new modification of nonlinear regression models for repeated measures data with heteroscedastic error structures by combining the transform-both-sides and weighting model from Carroll and Ruppert (1988) with the nonlinear random effects model from Lindstrom and Bates (1990). The proposed parameter estimators are a combination of pseudo maximum likelihood estimators for the transform-both-sides and weighting model and maximum likelihood (ML) or restricted maximum likelihood (REML) estimators for linear mixed effects models. The new method is investigated by analyzing simulated enzyme kinetic data published by Jones (1993).

3.
Summary: Quantile regression, which models the conditional quantiles of the response variable given covariates, usually assumes a linear model. However, this kind of linearity is often unrealistic in real life. One situation where linear quantile regression is not appropriate is when the response variable is piecewise linear but still continuous in covariates. To analyze such data, we propose a bent line quantile regression model. We derive its parameter estimates, prove that they are asymptotically valid given the existence of a change‐point, and discuss several methods for testing the existence of a change‐point in bent line quantile regression together with a power comparison by simulation. An example of land mammal maximal running speeds is given to illustrate an application of bent line quantile regression in which this model is theoretically justified and its parameters are of direct biological interest.

4.
This paper extends the work of KODLIN (1967), who proposed a method for analyzing patient survival data wherein the hazard rate was linearly related to the survival time. The present paper extends Kodlin's model to permit maximum likelihood estimation of the parameters so that covariate effects are included and the slope and intercept parameters are allowed to change over fixed intervals of the time domain of study. An illustration of the method using multiple myeloma data is given and the results are compared with those of Kodlin's model, the Feigl-Zelen, Zippin-Armitage model, the exponential model, and Cox's proportional hazards model.

5.
This paper considers inference for the break point in the segmented regression or piece‐wise regression model. Standard likelihood theory does not apply because the break point is absent under the null hypothesis. We use results by Davies for this type of non‐standard set‐up [Biometrika 64 (1977), 247–254 and 74 (1987), 33–43] to obtain a test for the null hypothesis of no break point. A confidence interval can be constructed provided replicate data are available. The methods are exemplified using two longitudinal datasets, one from ecology, the other from pharmacology.
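
For orientation, the segmented regression model in this abstract can be sketched as a continuous two-piece line fitted by profiling the residual sum of squares over a grid of candidate break points. This is only an illustration of the model, not the Davies-type test or the confidence interval from the paper; the function name `fit_segmented` and the grid of candidates are assumptions made for this example.

```python
import numpy as np

def fit_segmented(x, y, candidates):
    """Least-squares fit of y = b0 + b1*x + b2*max(x - tau, 0).

    Hypothetical helper: tau is chosen from `candidates` by minimizing
    the residual sum of squares (a simple profile search).
    """
    best = None
    for tau in candidates:
        # Design matrix: intercept, slope, and change in slope after tau.
        A = np.column_stack([np.ones_like(x), x, np.maximum(x - tau, 0.0)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        rss = float(np.sum((y - A @ coef) ** 2))
        if best is None or rss < best[0]:
            best = (rss, tau, coef)
    return best[1], best[2]
```

Under the null hypothesis of no break point the coefficient on `max(x - tau, 0)` is zero and `tau` is not identified, which is why the abstract notes that standard likelihood theory does not apply.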

6.
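A condition number method for selecting the independent variables of a regression equation and its application to RK surgery   Total citations: 1, self-citations: 1, citations by others: 1  
Choosing appropriate independent variables is the first problem in specifying a linear regression model. With the aim of eliminating multicollinearity among the independent variables, this paper introduces a condition number method for selecting the independent variables of a regression equation and applies it to predicting the outcomes of RK surgery.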
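
The abstract does not give the details of the procedure, so the following is only a sketch of one plausible way to use a condition number for variable selection: standardize the columns, compute the ratio of the largest to the smallest singular value, and greedily drop the variable whose removal lowers that ratio the most. The function names and the threshold of 30 are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def condition_number(X):
    """Condition number of the column-standardized design matrix X."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    s = np.linalg.svd(Z, compute_uv=False)  # singular values, descending
    return s[0] / s[-1]

def select_by_condition_number(X, threshold=30.0):
    """Greedy backward elimination (hypothetical helper).

    Drops one column at a time until the condition number of the
    remaining columns is at or below `threshold`. Returns kept indices.
    """
    keep = list(range(X.shape[1]))
    while len(keep) > 1 and condition_number(X[:, keep]) > threshold:
        # Condition number obtained after removing each remaining column.
        kappas = [condition_number(X[:, [c for c in keep if c != j]])
                  for j in keep]
        keep.pop(int(np.argmin(kappas)))
    return keep
```

A large condition number signals that some columns are nearly linear combinations of others, which is the multicollinearity the abstract aims to remove.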

7.
We propose a robust curve and surface estimator based on M-type estimators and penalty-based smoothing. This approach also includes an application to wavelet regression. The concept of pseudo data, a transformation of the robust additive model to the one with bounded errors, is used to derive some theoretical properties and also motivate a computational algorithm. The resulting algorithm, termed the es-algorithm, is computationally fast and provides a simple way of choosing the amount of smoothing. Moreover, it is easily described, straightforwardly implemented and can be extended to other wavelet regression settings such as irregularly spaced data and image denoising. Results from a simulation study and real data examples demonstrate the promising empirical properties of the proposed approach.

8.
9.
Consider the two linear regression models of $Y_{ij}$ on $X_{ij}$, namely $Y_{ij} = \beta_{i0} + \beta_{i1} X_{ij} + \varepsilon_{ij}$, $j = 1, 2, \ldots, n_i$, $i = 1, 2$, where the $\varepsilon_{ij}$ are assumed to be normally distributed with zero mean and common unknown variance $\sigma^2$. The estimated value of the mean of $Y_1$ for a given value of $X_1$ is made to depend on a preliminary test of significance of the hypothesis $\beta_{11} = \beta_{21}$. The bias and the mean square error of the estimator for the conditional mean of $Y_1$ are given. The relative efficiency of the estimator to the usual estimator is computed and is used to determine a proper choice of the significance level of the preliminary test.

10.
We investigated the temporal relationship between abdominal temperature, physical activity, perineal swelling, and urinary progesterone and estradiol concentrations over the menstrual cycle in unrestrained captive baboons. Using a miniature temperature‐sensitive data logger surgically implanted in the abdominal cavity and an activity data logger implanted subcutaneously on the trunk, we measured, continuously over 6 months at 10‐min intervals, abdominal temperature and physical activity patterns in four female adult baboons Papio hamadryas ursinus (12.9–19.9 kg), in cages in an indoor animal facility (22–25°C). We monitored menstrual bleeding and perineal swelling changes, and measured urinary progesterone and estradiol concentrations, daily for up to 6 months, to ascertain the stage and length of the menstrual cycle. The menstrual cycle was 36 ± 2 days (mean ± SD) long and the baboons exhibited cyclic changes in perineal swellings, abdominal temperature, physical activity, urinary progesterone, and estradiol concentrations over the cycle. Mean 24‐hr abdominal temperature during the luteal phase was significantly higher than during the periovulatory phase (ANOVA, F(2, 9) = 4.7; P = 0.04), but not different to that during the proliferative phase. Physical activity followed a similar pattern, with mean 24‐hr physical activity almost twice as high in the luteal than in the periovulatory phase (ANOVA, P = 0.58; F(2, 12) = 5.8). We have characterized correlates of the menstrual cycle in baboons and shown, for the first time, a rhythm of physical activity and abdominal temperature over the menstrual cycle, with a nadir of temperature and activity at ovulation. Am. J. Primatol. 74:1143‐1153, 2012. © 2012 Wiley Periodicals, Inc.

11.
Abstract: Detection distance is an important and common auxiliary variable measured during avian point count surveys. Distance data are used to determine the area sampled and to model the detection process using distance sampling theory. In densely forested habitats, visual detections of birds are rare, and most estimates of detection distance are based on auditory cues. Distance sampling theory assumes detection distances are measured accurately, but empirical validation of this assumption for auditory detections is lacking. We used a song playback system to simulate avian point counts with known distances in a forested habitat to determine the error structure of distance estimates based on auditory detections. We conducted field evaluations with 6 experienced observers both before and after distance estimation training. We conducted additional studies to determine the effect of height and speaker orientation (toward or away from observers) on distance estimation error. Distance estimation errors for all evaluations were substantial, although training reduced errors and bias in distance estimates by approximately 15%. Measurement errors showed a nonlinear relationship to distance. Our results suggest observers were not able to differentiate distances beyond 65 m. The height from which we played songs had no effect on distance estimation errors in this habitat. The orientation of the song source did have a large effect on distance estimation errors; observers generally doubled their distance estimates for songs played away from them compared with distance estimates for songs played directly toward them. These findings, which we based on realistic field conditions, suggest measures of uncertainty in distance estimates to auditory detections are substantially higher than assumed by most researchers. This means aural point count estimates of avian abundance based on distance methods deserve careful scrutiny because they are likely biased.

12.
Summary: We consider penalized linear regression, especially for “large p, small n” problems, for which the relationships among predictors are described a priori by a network. A class of motivating examples includes modeling a phenotype through gene expression profiles while accounting for coordinated functioning of genes in the form of biological pathways or networks. To incorporate the prior knowledge of the similar effect sizes of neighboring predictors in a network, we propose a grouped penalty based on the $L_\gamma$‐norm that smoothes the regression coefficients of the predictors over the network. The main feature of the proposed method is its ability to automatically realize grouped variable selection and exploit grouping effects. We also discuss effects of the choices of $\gamma$ and some weights inside the $L_\gamma$‐norm. Simulation studies demonstrate the superior finite‐sample performance of the proposed method as compared to Lasso, elastic net, and a recently proposed network‐based method. The new method performs best in variable selection across all simulation set‐ups considered. For illustration, the method is applied to a microarray dataset to predict survival times for some glioblastoma patients using a gene expression dataset and a gene network compiled from some Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways.

13.
14.
Basal Body Assembly in Ciliates: The Power of Numbers   Total citations: 1, self-citations: 0, citations by others: 1  
Centrioles perform the dual functions of organizing both centrosomes and cilia. The biogenesis of nascent centrioles is an essential cellular event that is tightly coupled to the cell cycle so that each cell contains only two or four centrioles at any given point in the cell cycle. The assembly of centrioles and their analogs, basal bodies, is well characterized at the ultrastructural level whereby structural modules are built into a functional organelle. Genetic studies in model organisms, combined with proteomics, bioinformatics and the identification of ciliary disease gene orthologs, have revealed a wealth of molecules requiring further analysis to determine their roles in centriole duplication, assembly and function. Nonetheless, at this stage, our understanding of how molecular components interact to build new centrioles and basal bodies is limited. The ciliates Tetrahymena and Paramecium have historically been the subject of cytological and genetic study of basal bodies. Recent advances in the ciliate genetic and molecular toolkit have placed these model organisms in a favorable position to study the molecular mechanisms of centriole and basal body assembly.

15.
For a linear regression model with random coefficients, this paper considers the estimation of the mean of the coefficient vector, which, in turn, involves the estimation of the variances of the random coefficients. The conventional estimation methods for these variances sometimes provide negative estimates. To circumvent this difficulty, a proposal is put forward and examined in the light of existing ones.

16.
A linear regression approach is presented for the statistical analysis of dose-response curves obtained by measuring the colony-forming ability of human fibroblast strains. The crucial determination of the dose range in which the linear model can be assumed is achieved by a combination of statistical criteria and biological claims. As a basic quantitative parameter we investigate the slope of the regression line and, by taking reciprocals, we retransform it into the biologically established parameter D0. Several methods for the combination of estimates are presented.
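
The retransformation of the slope into D0 can be sketched as follows. The abstract only says that reciprocals are taken, so the specific form here is an assumption: a log-linear survival curve ln S = a + b·dose with D0 = -1/b, and a delta-method standard error. The function name `d0_from_survival` is hypothetical.

```python
import numpy as np

def d0_from_survival(dose, surviving_fraction):
    """Fit ln(S) = a + b*dose by least squares and return (D0, se_D0).

    Assumed convention: D0 = -1/b, the dose that reduces survival by a
    factor of e on the linear part of the curve.
    """
    x = np.asarray(dose, dtype=float)
    y = np.log(np.asarray(surviving_fraction, dtype=float))
    n = x.size
    sxx = np.sum((x - x.mean()) ** 2)
    b = np.sum((x - x.mean()) * (y - y.mean())) / sxx   # slope
    a = y.mean() - b * x.mean()                         # intercept
    resid = y - (a + b * x)
    se_b = np.sqrt(np.sum(resid ** 2) / (n - 2) / sxx)  # slope standard error
    d0 = -1.0 / b
    se_d0 = se_b / b ** 2  # delta method: |d(-1/b)/db| = 1/b^2
    return d0, se_d0
```

Because D0 is a nonlinear function of the slope, its standard error here is only a first-order approximation.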

17.
A one-year birth cohort from Northern Finland has been followed up since 1966. As a part of this study, this paper analyses the progression of myopia (nearsightedness) up to the age of 20 years. The random coefficient regression model was chosen for the analysis because of the large individual variation in the development of myopia. Maximum likelihood estimates for the parameters in the model were obtained via the expectation maximization (EM) algorithm. It is shown how the estimated model can be used to predict future observations for an individual using the previously recorded refractive error measurements as well as other relevant data on the patient in question.

18.
This article presents a method to test for the presence of relatively small systematic measurement errors, e.g., those caused by inaccurate calibration or sensor drift. To do this, primary measurements (flow rates and concentrations) are first translated into observed conversions, which should satisfy several constraints, like the laws of conservation of chemical elements. This study considers three objectives: (1) Modification of the commonly used balancing technique to improve error sensitivity so that small systematic errors can be detected. To this end, the balancing technique is applied sequentially in time. (2) Extension of the method to enable direct diagnosis of errors in the primary measurements instead of diagnosing errors in the observed conversions. This was achieved by analyzing how individual errors in the primary measurements are expressed in the residual vector. (3) Derivation of a new systematic method to quantitatively determine the error sensitivity, that is, the error size at which the expected value of the chi-square test function equals its critical value. The method is applied to industrial data, demonstrating the effectiveness of the approach. It was shown that, for most possible error sources, systematic errors of 2% to 5% could be detected. In the given application, variation in the N-content of the biomass was identified as the cause of the errors. (c) 1994 John Wiley & Sons, Inc.

19.
Summary: Clinicians are often interested in the effect of covariates on survival probabilities at prespecified study times. Because different factors can be associated with the risk of short‐ and long‐term failure, a flexible modeling strategy is pursued. Given a set of multiple candidate working models, an objective methodology is proposed that aims to construct consistent and asymptotically normal estimators of regression coefficients and average prediction error for each working model, that are free from the nuisance censoring variable. It requires the conditional distribution of censoring given covariates to be modeled. The model selection strategy uses step-up or step-down multiple hypothesis testing procedures that control either the proportion of false positives or generalized familywise error rate when comparing models based on estimates of average prediction error. The context can actually be cast as a missing data problem, where augmented inverse probability weighted complete case estimators of regression coefficients and prediction error can be used (Tsiatis, 2006, Semiparametric Theory and Missing Data). A simulation study and an interesting analysis of a recent AIDS trial are provided.

20.
The model used in this paper is $Y = X\beta$, where … with unknown $x_0$. Estimators of $x_0$ are derived by putting $\beta_m x_0 = \beta_{m+1}$, regarding $\beta_{m+1}$ as a new unknown parameter. Formally we use the model $Y = X_1 \beta_+ + e$, where $\beta_+' = (\beta_0, \ldots, \beta_{m+1})$ and … Then $\beta_{m+1} / \beta_m$ is a point estimator of $x_0$. Assuming normality for $e$ and taking the random variable $z = \beta_m x_0 - \beta_{m+1}$, we get a t-distributed variable and finally a confidence estimator of $x_0$. The formulas are applied to dose-response relations in antibiotic assays referring to a standard. Now we can take into account not only the dependence on the dose/concentration but also on the position on the test agar plate where the test solution is filled in. As a consequence, the confidence interval of the unknown dose/concentration $x_0$ becomes shorter and the statements thereby become more precise.
