Found 20 similar articles; search took 15 ms
1.
Noncrossing quantile regression curve estimation (total citations: 4; self-citations: 0; citations by others: 4)
Since quantile regression curves are estimated individually, the quantile curves can cross, leading to an invalid distribution for the response. A simple constrained version of quantile regression is proposed to avoid the crossing problem for both linear and nonparametric quantile curves. A simulation study and a reanalysis of tropical cyclone intensity data show the usefulness of the procedure. Asymptotic properties of the estimator are equivalent to those of the typical approach under standard conditions, and the proposed estimator reduces to the classical one if there is no crossing. The performance of the constrained estimator has shown significant improvement by adding smoothing and stability across the quantile levels.
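The crossing problem described in this abstract can be illustrated with a minimal sketch. It is not the constrained estimator from the paper; the function names, the two fitted lines, and the grid of x values are all hypothetical, chosen only to show the check (pinball) loss that quantile regression minimizes and how two separately fitted quantile lines can cross.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def crossing_points(lower, upper, xs):
    """Return the x values at which the lower-quantile line lies above the upper one.

    `lower` and `upper` are hypothetical (intercept, slope) pairs.
    """
    (a_lo, b_lo), (a_hi, b_hi) = lower, upper
    return [x for x in xs if a_lo + b_lo * x > a_hi + b_hi * x]


# Illustrative lines, as if fitted separately at tau = 0.25 and tau = 0.75.
q25 = (0.0, 1.0)   # y = x
q75 = (2.0, 0.5)   # y = 2 + 0.5 * x
xs = [0, 1, 2, 3, 4, 5, 6]

# Any x listed here implies an invalid conditional distribution,
# because the 25th percentile exceeds the 75th.
crossed = crossing_points(q25, q75, xs)
```

A constrained fit would rule out such points over the region of interest, while leaving the estimates unchanged when `crossed` is already empty.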
2.
Modern data-rich analyses may call for fitting a large number of nonparametric quantile regressions. For example, growth charts may be constructed for each of a collection of variables, to identify those for which individuals with a disorder tend to fall in the tails of their age-specific distribution; such variables might serve as developmental biomarkers. When such a large set of analyses is carried out by penalized spline smoothing, reliable automatic selection of the smoothing parameter is particularly important. We show that two popular methods for smoothness selection may tend to overfit when estimating extreme quantiles as a smooth function of a predictor such as age, and that improved results can be obtained by multifold cross-validation or by a novel likelihood approach. A simulation study, and an application to a functional magnetic resonance imaging data set, demonstrate the favorable performance of our methods.
3.
Methods of estimation in log odds ratio regression models (total citations: 1; self-citations: 0; citations by others: 1)
McCullagh's (1984, Journal of the Royal Statistical Society, Series B 46, 250-256) approximation to the conditional maximum likelihood estimator in log odds ratio regression models is shown to have negligible asymptotic bias unless the odds ratios are large and the sample sizes in individual 2 × 2 tables are very small. In application to two sets of case-control data, it yields results virtually indistinguishable from those of the conditional analysis. A generalization of the Mantel-Haenszel estimator proposed by Davis (1985, Biometrics 41, 487-495) does not approximate the conditional results nearly as well.
4.
5.
The prevalence of overweight children in the United States has increased dramatically over the past two decades, and is creating well-known public health problems. Moreover, there is also evidence that children who are not overweight are becoming heavier. We use quantile regression models along with standard ordinary least squares (OLS) models to explore the correlates of childhood weight status and overweight as measured by the Body Mass Index (BMI). This approach allows the effects of covariates to vary depending on where in the BMI distribution a child is located. Our results indicate that OLS masks some of the important correlates of child BMI at the upper and lower tails of the weight distribution. For example, mother's education has no effect on black children, but is associated with improvements in BMI for overweight white boys and underweight white girls. Conversely, mother's cognitive aptitude has no effect on white boys, but is associated with BMI improvements for underweight black children and overweight white girls. Further, we find that underweight white children and black girls experience similar improvements in BMI as they get older, but that for black boys there is little if any association between age and BMI anywhere in the BMI distribution.
6.
It is important to preprocess high-throughput data generated from mass spectrometry experiments in order to obtain a successful proteomics analysis. Outlier detection is an important preprocessing step. A naive outlier detection approach may miss many true outliers and instead select many non-outliers because of the heterogeneous variability commonly observed in high-throughput data. To address this issue, we developed an outlier detection program that accounts for the heterogeneous variability by using linear, non-linear and non-parametric quantile regression techniques. Our program was developed in the R language, so it can be used interactively and conveniently in the R environment. AVAILABILITY: An R package, OutlierD, is available at the Bioconductor project at http://www.bioconductor.org
7.
MOTIVATION: The identification of DNA copy number changes provides insights that may advance our understanding of initiation and progression of cancer. Array-based comparative genomic hybridization (array-CGH) has emerged as a technique allowing high-throughput genome-wide scanning for chromosomal aberrations. A number of statistical methods have been proposed for the analysis of array-CGH data. In this article, we consider a fused quantile regression model based on three motivations: (1) quantile regression may provide a more comprehensive picture for the ratio profile of copy numbers than the standard mean regression approach; (2) for simplicity, most available methods assume uniform spacing between neighboring clones, while incorporating the information of physical locations of clones may be helpful; and (3) most current methods have a set of tuning parameters that must be carefully tuned, which introduces complexity to the implementation. RESULTS: We formulate the detection of regions of gains and losses in a fused regularized quantile regression framework, incorporating physical locations of clones. We derive an efficient algorithm that computes the entire solution path for the resulting optimization problem, and we propose a simple estimate for the complexity of the fitted model, which leads to convenient selection of the tuning parameter. Three published array-CGH datasets are used to demonstrate our approach. AVAILABILITY: R code is available at http://www.stat.lsa.umich.edu/~jizhu/code/cgh/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
8.
Chronic child undernutrition is a persistent problem in developing countries and has been the focus of hundreds of studies where the primary intent is to improve targeting of public health and economic development policies. In national-level cross-sectional studies, undernutrition is measured as child stunting and the goal is to assess differences in prevalence among population subgroups. Several types of regression modeling frameworks have been used to study childhood stunting, but the literature provides little guidance in terms of statistical properties and the ease with which the results can be communicated to the policy community. We compare the results from quantile regression and ordinal regression models. The two frameworks can be linked analytically and together yield complementary insights. We find that reflecting on interpretations from both models leads to a more thorough analysis and forces the analyst to consider the policy utility of the findings. Guatemala is used as the country focus for the study.
9.
Background
Identification of protein interaction networks has received considerable attention in the post-genomic era. The currently available biochemical approaches used to detect protein-protein interactions are all time and labour intensive. Consequently there is a growing need for the development of computational tools that are capable of effectively identifying such interactions.
Results
Here we explain the development and implementation of a novel Protein-Protein Interaction Prediction Engine termed PIPE. This tool is capable of predicting protein-protein interactions for any target pair of the yeast Saccharomyces cerevisiae proteins from their primary structure and without the need for any additional information or predictions about the proteins. PIPE showed a sensitivity of 61% for detecting any yeast protein interaction with 89% specificity and an overall accuracy of 75%. This rate of success is comparable to those associated with the most commonly used biochemical techniques. Using PIPE, we identified a novel interaction between YGL227W (vid30) and YMR135C (gid8) yeast proteins. This led us to the identification of a novel yeast complex that here we term vid30 complex (vid30c). The observed interaction was confirmed by tandem affinity purification (TAP tag), verifying the ability of PIPE to predict novel protein-protein interactions. We then used PIPE analysis to investigate the internal architecture of vid30c. It appeared from PIPE analysis that vid30c may consist of a core and a secondary component. Generation of yeast gene deletion strains combined with TAP tagging analysis indicated that the deletion of a member of the core component interfered with the formation of vid30c; however, deletion of a member of the secondary component had little effect (if any) on the formation of vid30c. Also, PIPE can be used to analyse yeast proteins for which TAP tagging fails, thereby allowing us to predict protein interactions that are not included in genome-wide yeast TAP tagging projects.
Conclusion
PIPE analysis can predict yeast protein-protein interactions. Also, PIPE analysis can be used to study the internal architecture of yeast protein complexes. The data also suggest that a finite set of short polypeptide signals seems to be responsible for the majority of the yeast protein-protein interactions.
10.
Background
In a microarray experiment the difference in expression between genes on the same slide is up to 10³-fold or more. At low expression, even a small error in the estimate will have great influence on the final test and reference ratios. In addition to the true spot intensity, the scanned signal consists of different kinds of noise referred to as background. In order to assess the true spot intensity, background must be subtracted. The standard approach to estimate background intensities is to assume they are equal to the intensity levels between spots. In the literature, morphological opening is suggested to be one of the best methods for estimating background this way.
11.
Time-varying parametric linear and time-varying nonparametric regression models as well as a time-varying nonparametric median
regression model are developed to predict the daily pollen concentration for Szeged in Hungary using previous-day meteorological
parameters and the daily pollen concentration. The models are applied to rainy days and non-rainy days, respectively. The
most important predictor is the previous-day pollen concentration level, and the only other predictor retained by a stepwise
regression procedure is the daily mean global solar flux for rainy days and the daily mean temperature for non-rainy days.
Although the variance percentage explained by these two predictors is higher for non-rainy (55.2%) days than for rainy (51.9%)
days, the prediction rate is slightly better for rainy than for non-rainy days. Nonparametric regression yields substantially
better estimates, especially for rainy days indicating a nonlinear relationship between the predictors and the pollen concentration.
The explained variance percentage is 71.4 and 64.6% for rainy and non-rainy days, respectively. Concerning the mean absolute
error, the nonparametric median regression provides the best estimate. The quantile regression shows that probability distribution
of daily ragweed concentration is much more skewed for non-rainy days, while the more concentrated probability distribution
for rainy days exhibits relatively stable ragweed pollen concentrations. The possible lowest limits of concentrations are
also calculated. Under highly favorable conditions for peak concentrations, the pollen level reaches at least 350 grains m⁻³ and 450 grains m⁻³ for rainy and non-rainy days, respectively. These values again underline the excessive ragweed pollen load over the area
of Szeged.
12.
Ireneous N. Soyiri, Daniel D. Reidpath, Christophe Sarran. International Journal of Biometeorology, 2013, 57(4): 569-578
Asthma is a chronic condition of great public health concern globally. The associated morbidity, mortality and healthcare utilisation place an enormous burden on healthcare infrastructure and services. This study demonstrates a multistage quantile regression approach to predicting excess demand for health care services in the form of asthma daily admissions in London, using retrospective data from the Hospital Episode Statistics, weather and air quality. Trivariate quantile regression models (QRMs) of asthma daily admissions were fitted to a 14-day range of lags of environmental factors, accounting for seasonality in a hold-in sample of the data. Representative lags were pooled to form multivariate predictive models, selected through a systematic backward stepwise reduction approach. Models were cross-validated using a hold-out sample of the data, and their respective root mean square error measures, sensitivity, specificity and predictive values compared. Two of the predictive models were able to detect extreme numbers of daily asthma admissions at sensitivity levels of 76% and 62%, as well as specificities of 66% and 76%. Their positive predictive values were slightly higher for the hold-out sample (29% and 28%) than for the hold-in model development sample (16% and 18%). QRMs can be used in a multistage fashion to select suitable variables to forecast extreme asthma events. The associations between asthma and environmental factors, including temperature, ozone and carbon monoxide, can be exploited in predicting future events using QRMs.
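The validation measures named in this abstract follow directly from a 2 × 2 confusion table. The sketch below shows the arithmetic; the counts are hypothetical and are not taken from the study, and the function name is an assumption made for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity and positive predictive value (PPV)."""
    return {
        "sensitivity": tp / (tp + fn),   # share of true extreme days that were flagged
        "specificity": tn / (tn + fp),   # share of ordinary days correctly left unflagged
        "ppv": tp / (tp + fp),           # share of flagged days that were truly extreme
    }


# Hypothetical hold-out counts: 40 extreme days among 200 days in total.
metrics = diagnostic_metrics(tp=30, fp=60, fn=10, tn=100)
```

With these made-up counts the sensitivity is 0.75 and the specificity 0.625, yet the PPV is only about 0.33. When extreme days are rare, false alarms can outnumber true detections even at reasonable sensitivity and specificity, which is one plausible reading of the modest predictive values reported above.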
13.
Variation in nuclear DNA content across environmental gradients: a quantile regression analysis (total citations: 9; self-citations: 0; citations by others: 9)
The nuclear DNA content of angiosperms varies by several orders of magnitude. Previous studies suggest that variation in 2C DNA content (i.e. the amount of DNA in G1 phase nuclei, also referred to as the 2C-value) is correlated with environmental factors, but there are conflicting reports in the literature concerning the nature of these relationships. We examined variation in 2C DNA content for 401 species in the ecologically diverse California flora in relation to the mean July maximum temperature, January minimum temperature, and annual precipitation within the geographical ranges of these species. Species with small 2C-values predominate in all environments. Species with large 2C-values occur at intermediate July maximum temperatures, and decline in frequency at both extremes of the July temperature gradient, and with decreasing annual precipitation. Our analysis demonstrates the utility of quantile regression for statistical inference of complex distributions such as these. The method supports our observation that relationships between nuclear DNA content and environmental factors are stronger for species with large 2C-values.
14.
For a prospective randomized clinical trial with two groups, the relative risk can be used as a measure of treatment effect and is directly interpretable as the ratio of success probabilities in the new treatment group versus the placebo group. For a prospective study with many covariates and a binary outcome (success or failure), relative risk regression may be of interest. If we model the log of the success probability as a linear function of covariates, the regression coefficients are log-relative risks. However, using such a log-linear model with a Bernoulli likelihood can lead to convergence problems in the Newton-Raphson algorithm. This is likely to occur when the success probabilities are close to one. A constrained likelihood method proposed by Wacholder (1986, American Journal of Epidemiology 123, 174-184) also has convergence problems. We propose a quasi-likelihood method of moments technique in which we naively assume the Bernoulli outcome is Poisson, with the mean (success probability) following a log-linear model. We use the Poisson maximum likelihood equations to estimate the regression coefficients without constraints. Using method of moments ideas, one can show that the estimates using the Poisson likelihood will be consistent and asymptotically normal. We apply these methods to a double-blinded randomized trial in primary biliary cirrhosis of the liver (Markus et al., 1989, New England Journal of Medicine 320, 1709-1713).
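The idea of solving the Poisson likelihood equations for a binary outcome can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an intercept and a single binary treatment indicator, the data are made up, and the function name is hypothetical. The score equations solved are sum_i x_i (y_i - exp(b0 + b1 x_i)) = 0.

```python
import math


def fit_log_linear_poisson(xs, ys, iterations=50):
    """Newton-Raphson on the Poisson score equations for log E[y] = b0 + b1 * x."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0  # start from the overall success rate
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * x) for x in xs]
        # Score vector.
        s0 = sum(y - m for y, m in zip(ys, mu))
        s1 = sum(x * (y - m) for x, y, m in zip(xs, ys, mu))
        # Information matrix [[i00, i01], [i01, i11]].
        i00 = sum(mu)
        i01 = sum(x * m for x, m in zip(xs, mu))
        i11 = sum(x * x * m for x, m in zip(xs, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: inverse information times score.
        b0 += (i11 * s0 - i01 * s1) / det
        b1 += (i00 * s1 - i01 * s0) / det
    return b0, b1


# Made-up trial: 2 of 10 successes on placebo (x = 0), 6 of 10 on treatment (x = 1).
xs = [0] * 10 + [1] * 10
ys = [1] * 2 + [0] * 8 + [1] * 6 + [0] * 4

b0, b1 = fit_log_linear_poisson(xs, ys)
relative_risk = math.exp(b1)  # ratio of the two success probabilities, 0.6 / 0.2
```

In this two-group case the solution reproduces the observed success proportions, so `exp(b1)` equals the sample relative risk. Because a Bernoulli outcome has smaller variance than a Poisson one with the same mean, the model-based Poisson standard errors would not be appropriate; a robust variance estimate is needed and is not shown here.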
15.
Principal component estimation for generalized linear regression (total citations: 1; self-citations: 0; citations by others: 1)
16.
Community resilience offers a conceptual framework for assessing a community's capacity for coping with environmental changes and emergency situations. It is perceived as a core element of sustainable lifestyle, helping to mitigate the community's reaction to crises by facilitating purposeful and collective action on the part of its members. The conjoint community resilience assessment measure (CCRAM) provides a standard measure of community resilience including five factors: leadership, collective efficacy, preparedness, place attachment, and social trust. The mean scores of each of the factors portray a community resilience profile, and the overall CCRAM score is calculated as the equally weighted average of the scores of the 21 survey items. Two regression models were employed: logistic regression, a commonly used tool in the field of applied statistics, and quantile regression, a non-parametric method that facilitates the detection of the effect of a regressor on various quantiles of the dependent variable. The study aims to demonstrate the innovative use of quantile regression modeling in community resilience analysis. The results demonstrate that the quantile regression was significantly more sensitive to sub-populations than the logistic regression. Having an income below average, which was negatively correlated with perceived community resilience in the logistic model, was found to be significant only in the lower (Q10, Q25) resilience quantiles. Age (per year) and previous involvement in emergency situations, which were not noted as significant in the logistic regression, were found to be positively associated with perceived community resilience in the lowest quantile. A difference between quantiles of perceived community resilience was noted in regard to size of community.
The association between size of community and perceived community resilience, which was negative in the logistic regression (residents of larger towns had lower community resilience), was found to be such only up to quantile 75, but it reversed in the highest quantile. It was concluded that the utilization of quantile regression analysis in studies of community resilience can facilitate the creation of tailored response plans, adapted to the needs of sub-populations (such as weaker ones), and help enhance overall community resilience in crises.
17.
Background
Vascular smooth muscle cells (VSMCs) are mature cells that play critical roles in both normal and aberrant cardiovascular conditions. In response to various environmental cues, VSMCs can dedifferentiate from a contractile state to a highly proliferative synthetic state through the so-called ‘phenotypic switching’ process. Changes in VSMC phenotype contribute to numerous vascular-related diseases, including atherosclerosis, calcification, and restenosis following angioplasty. Adventitial VSMC progenitor cells also contribute to formation of the neointima.
Methods/Results
Herein, we review both the roles of VSMC differentiation in vascular diseases and the in vitro models used to investigate the molecular mechanisms involved in the regulation of VSMC differentiation and phenotype modulation.
Conclusion
A comprehensive understanding of VSMC behavior in vascular diseases is essential to identify new therapeutic targets for the prevention and treatment of cardiovascular diseases.
18.
19.
Wide cross-country variation in obesity rates has been reported between European Union member states. Although the existing cross-country differences have not been analyzed in depth, they contain important information on health production determinants. In this paper we apply a methodology for conducting standardized cross-country comparisons of body mass index (BMI). We draw on estimations of the marginal density function of BMI for Italy and Spain in 2003, two countries with similar GDP and socio-economic conditions. We produce different counterfactual distribution estimates using covariates (health production inputs) specified in a quantile regression. Our findings suggest that Spain-to-Italy BMI gaps among females are largely explained by cross-country variation in the returns to each covariate, especially for younger women. We find that adverse underlying determinants do not explain the gap observed in particular between younger Spanish females and their Italian counterfactuals; behavioural differences appear to be the key. We tentatively conclude that Spanish policy on obesity should target mainly younger females.
20.
Minimum distance estimation for the logistic regression model (total citations: 1; self-citations: 0; citations by others: 1)