Similar literature
 20 similar articles found (search time: 15 ms)
1.
2.
The present investigation compared the values obtained for the intestinal crypt cell production rate per hour (CCPR) at several sites in the intestine of rats using two variants of the metaphase arrest technique. The CCPR was determined both from the slope of a metaphase accumulation line obtained by linear regression analysis of measurements at several time points and by the single time point accumulation method. The comparison was performed both for the steady state in untreated controls and under perturbed conditions at 6, 12 and 24 h following intraperitoneal administration of 1000 mg/kg bodyweight of hydroxyurea to rats. In the steady state, the metaphase accumulation values were linear up to 3 h after vincristine sulfate in the proximal intestine (stomach to proximal ileum) and linear for up to 3.5 h in the distal ileum and the colon. Consequently, the 3 h time point was selected for evaluation of CCPR values using the single time point method. The two methods gave equivalent results in the steady state, although in situations where there was good linearity of metaphase accumulation, the values obtained by the regression method were usually more precise. In the perturbed intestine, poorer linearity of metaphase accumulation was observed and the duration of linearity was sometimes reduced to 2 to 2.5 h. Overall, under these circumstances, estimation of average CCPR was more precise by the single time point accumulation method. More importantly, significant differences were sometimes evident between the results of these two methods when applied to the same data.
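The two estimators compared in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function names and the counts are hypothetical, and the single time point estimate assumes accumulation starts from zero at the time of vincristine injection.

```python
def regression_ccpr(times_h, metaphases_per_crypt):
    """Least-squares slope of arrested metaphases per crypt against time (cells/crypt/h)."""
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_m = sum(metaphases_per_crypt) / n
    sxy = sum((t - mean_t) * (m - mean_m)
              for t, m in zip(times_h, metaphases_per_crypt))
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    return sxy / sxx


def single_point_ccpr(time_h, metaphases_per_crypt):
    """Single time point estimate: accumulated metaphases divided by elapsed time.

    Assumes (hypothetically) that accumulation starts at zero at injection.
    """
    return metaphases_per_crypt / time_h


# Hypothetical data: accumulation line through the origin.
times = [1.0, 2.0, 3.0]
counts_origin = [10.0, 20.0, 30.0]
slope_origin = regression_ccpr(times, counts_origin)
single_origin = single_point_ccpr(times[-1], counts_origin[-1])

# Hypothetical data: same slope, but the line has a nonzero intercept.
counts_offset = [12.0, 22.0, 32.0]
slope_offset = regression_ccpr(times, counts_offset)
single_offset = single_point_ccpr(times[-1], counts_offset[-1])
```

When the accumulation line passes through the origin the two estimates coincide; with a nonzero intercept they diverge, which is one simple way the two methods can disagree on the same data.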

3.
Differential gene expression detection and sample classification using microarray data have received much research interest recently. Owing to the large number of genes p and small number of samples n (p > n), microarray data pose big challenges for statistical analysis. An obvious problem with 'large p, small n' data is over-fitting: just by chance, we are likely to find some non-differentially expressed genes that classify the samples very well. The idea of shrinkage is to regularize the model parameters to reduce the effects of noise and produce reliable inferences. Shrinkage has been successfully applied in microarray data analysis. The SAM statistics proposed by Tusher et al. and the 'nearest shrunken centroid' proposed by Tibshirani et al. are ad hoc shrinkage methods. Both methods are simple and intuitive, and have proved useful in empirical studies. Recently, Wu proposed the penalized t/F-statistics with shrinkage by formally using L1-penalized linear regression models for two-class microarray data, showing good performance. In this paper we systematically discuss the use of penalized regression models for analyzing microarray data. We generalize the two-class penalized t/F-statistics proposed by Wu to multi-class microarray data. We formally derive the ad hoc shrunken centroid used by Tibshirani et al. using L1-penalized regression models. We also show that penalized linear regression models provide a rigorous and unified statistical framework for sample classification and differential gene expression detection.
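The link between an L1 penalty and shrinkage can be illustrated with soft thresholding. For a single coefficient b and an unpenalized estimate d, minimizing 0.5*(d - b)^2 + lam*|b| has a closed-form solution that moves d toward zero by lam and sets it to exactly zero when |d| <= lam. The sketch below is a generic textbook identity, not the paper's derivation; the function names and numbers are hypothetical.

```python
def soft_threshold(d, lam):
    """Minimizer over b of 0.5 * (d - b)**2 + lam * abs(b), for lam >= 0."""
    if d > lam:
        return d - lam
    if d < -lam:
        return d + lam
    return 0.0


def penalized_objective(b, d, lam):
    """L1-penalized squared-error objective for one coefficient."""
    return 0.5 * (d - b) ** 2 + lam * abs(b)


# Hypothetical per-gene differences between a class centroid and the overall centroid.
differences = [3.0, -0.5, -3.0, 0.2]
shrunken = [soft_threshold(d, 1.0) for d in differences]
```

Small differences are set to zero, so the corresponding genes drop out, while large differences are kept but shrunk.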

4.
A statistical technique is given for fitting the linear-quadratic model to experimental quantal response multifraction data using the time of the response as the end-point. The analysis used is based on the Cox Proportional Hazards model. The technique is useful for late effects where the time of occurrence of the response is dose dependent. The technique is compared to logistic regression analysis and the advantages and disadvantages are discussed. Both methods are applied to a lung pneumonitis experiment and a kidney experiment.

5.
6.
MOTIVATION: Recently, the temporal response of genes to changes in their environment has been investigated using cDNA microarray technology by measuring the gene expression levels at a small number of time points. Conventional techniques for time series analysis are not suitable for such a short series of time-ordered data. The analysis of gene expression data has therefore usually been limited to a fold-change analysis, instead of a systematic statistical approach. METHODS: We use the maximum likelihood method together with Akaike's Information Criterion to fit linear splines to a small set of time-ordered gene expression data in order to infer statistically meaningful information from the measurements. The significance of measured gene expression data is assessed using Student's t-test. RESULTS: Previous gene expression measurements of the cyanobacterium Synechocystis sp. PCC6803 were reanalyzed using linear splines. The temporal response of many genes that had been missed by a fold-change analysis was identified. Based on our statistical analysis, we found that about four or more gene expression measurements are needed at each time point.

7.
The influence of eye movement-related artifacts on electroencephalography (EEG) signals of human subjects, who were requested to perform a direction- or viewing-area-dependent saccade task, was investigated by simultaneously recording ocular potentials as electro-oculography (EOG). In the past, EOG artifact removal has been studied in tasks with a single fixation point in the screen center, with less attention to the sensitivity of the EEG head map to cornea-retinal dipole orientations. In the present study, we hypothesized the existence of a systematic EOG influence that differs according to coupling conditions of eye-movement directions with viewing areas including different fixation points. The effect was validated in the linear regression analysis by using 12 task conditions combining horizontal/vertical eye-movement direction and three segregated zones of gaze in the screen. First, event-related potential topographic patterns were analyzed to compare the 12 conditions, and propagation coefficients of the linear regression analysis were then calculated for each condition. As a result, the EOG influences were significantly different in a large number of EEG channels, especially in the case of horizontal eye movements. In the cross validation, the linear regression analysis using the appropriate dataset of the target direction/viewing area combination demonstrated improved performance compared with the traditional methods using a single fixation at the center. This result may open a way to improve artifact correction methods by considering the systematic EOG influence, which can be predicted from the view angle, for example by using eye-tracker systems.
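Regression-based EOG correction, in its simplest single-channel form, estimates one propagation coefficient per EEG channel and subtracts the scaled EOG from the EEG. The sketch below shows only that generic idea under stated assumptions: one EOG channel, one EEG channel, hypothetical function names and made-up sample values. It does not reproduce the condition-specific analysis described in the abstract.

```python
def propagation_coefficient(eog, eeg):
    """Least-squares slope of EEG on EOG (how strongly EOG leaks into this EEG channel)."""
    n = len(eog)
    mean_eog = sum(eog) / n
    mean_eeg = sum(eeg) / n
    cov = sum((x - mean_eog) * (y - mean_eeg) for x, y in zip(eog, eeg))
    var = sum((x - mean_eog) ** 2 for x in eog)
    return cov / var


def remove_eog(eog, eeg, coefficient):
    """Subtract the EOG contribution predicted by the propagation coefficient."""
    return [y - coefficient * x for x, y in zip(eog, eeg)]


# Hypothetical samples: a flat EEG baseline of 5 plus 20% of the EOG signal.
eog_samples = [0.0, 10.0, 20.0, 10.0, 0.0]
eeg_samples = [5.0, 7.0, 9.0, 7.0, 5.0]

b = propagation_coefficient(eog_samples, eeg_samples)
corrected = remove_eog(eog_samples, eeg_samples, b)
```

Estimating a separate coefficient for each direction and viewing-area condition, as the abstract describes, would amount to calling `propagation_coefficient` on the data subset for that condition.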

8.
BACKGROUND: Mass spectrometry (MS) data are often generated from various biological or chemical experiments and may contain outlying observations, which are extreme due to technical reasons. The determination of outlying observations is important in the analysis of replicated MS data because elaborate pre-processing is essential for successful analysis with reliable results, and manual outlier detection as one of the pre-processing steps is time-consuming. The heterogeneity of variability and low replication are often obstacles to successful analysis, including outlier detection. Existing approaches, which assume constant variability, can generate many false positives (outliers) and/or false negatives (non-outliers). Thus, a more powerful and accurate approach is needed to account for the heterogeneity of variability and low replication. FINDINGS: We proposed an outlier detection algorithm using projection and quantile regression in MS data from multiple experiments. The performance of the algorithm and program was demonstrated by using both simulated and real-life data. The projection approach with linear, nonlinear, or nonparametric quantile regression was appropriate in heterogeneous high-throughput data with low replication. CONCLUSION: Various quantile regression approaches combined with projection were proposed for detecting outliers. The choice among linear, nonlinear, and nonparametric regressions depends on the degree of heterogeneity of the data. The proposed approach was illustrated with MS data with two or more replicates.

9.
10.
11.
The generalized estimating equations (GEE) derived by Liang and Zeger to analyze longitudinal data have been used in a wide range of medical and biological applications. To make regression a useful and meaningful statistical tool, emphasis should be placed not only on inference or fitting, but also on diagnosing potential data problems. Most of the usual diagnostics for linear regression models have been generalized for GEE. However, global influence measures based on the volume of confidence ellipsoids are not available for GEE analysis. This article presents an extension of these measures that is valid for correlated-measures regression analysis using GEEs. The proposed measures are illustrated by an analysis of epileptic seizure count data arising from a study of progabide as an adjuvant therapy for partial seizures, and by some simulated data sets.

12.
13.
Organic carbon (C) associated with fine soil particles (<20 μm) is relatively stable and accounts for a large proportion of total soil organic C (SOC). The soil C saturation concept proposes a maximal amount of SOC that can be stabilized in the fine soil fraction, and the soil C saturation deficit (i.e., the difference between current SOC and the maximal amount) is presumed to affect the capacity, magnitude, and rate of SOC storage. In this study, we argue that predictions using current models underestimate maximal organic C stabilization of fine soil particles due to fundamental limitations of using least-squares linear regression. The objective was to improve predictions of maximal organic C stabilization by using two alternative approaches: one mechanistic, based on organic C loadings, and one statistical, based on boundary line analysis. We collected 342 data points on the organic C content of fine soil particles, fine particle mass proportions in bulk soil, dominant soil mineral types, and land use types from 32 studies. Predictions of maximal organic C stabilization using linear regression models are questionable because of the use of data from soils that may not be saturated in SOC and because of the nature of regression itself, resulting in a high proportion of presumed over-saturated samples. Predictions of maximal organic C stabilization using the organic C loading approach fit the data well for soils dominated by 2:1 minerals, but not for soils dominated by 1:1 minerals, suggesting that the use of a single value for specific surface area, and therefore a single organic C loading, to represent a large dataset is problematic. In boundary line analysis, only data representing soils having reached the maximal amount (upper tenth percentile) were used. The boundary line analysis estimate of maximal organic C stabilization (78 ± 4 g C kg⁻¹ fraction) was more than double the estimate by the linear regression approach (33 ± 1 g C kg⁻¹ fraction). These results show that linear regression models do not adequately predict maximal organic C stabilization. Soil properties associated with soil mineralogy, such as specific surface area and organic C loading, should be incorporated to generate more mechanistic models for predicting soil C saturation, but in their absence, statistical models should represent the upper envelope rather than the average value.

14.
Asymmetric regression is an alternative to conventional linear regression that allows us to model the relationship between predictor variables and the response variable while accommodating skewness. Advantages of asymmetric regression include incorporating realistic ecological patterns observed in data, robustness to model misspecification and less sensitivity to outliers. Bayesian asymmetric regression relies on asymmetric distributions such as the asymmetric Laplace (ALD) or asymmetric normal (AND) in place of the normal distribution used in classic linear regression models. Asymmetric regression concepts can be used for process and parameter components of hierarchical Bayesian models and have a wide range of applications in data analyses. In particular, asymmetric regression allows us to fit more realistic statistical models to skewed data and pairs well with Bayesian inference. We first describe asymmetric regression using the ALD and AND. Second, we show how the ALD and AND can be used for Bayesian quantile and expectile regression for continuous response data. Third, we consider an extension to generalize Bayesian asymmetric regression to survey data consisting of counts of objects. Fourth, we describe a regression model using the ALD, and show that it can be applied to add needed flexibility, resulting in better predictive models compared to Poisson or negative binomial regression. We demonstrate concepts by analyzing a data set consisting of counts of Henslow’s sparrows following prescribed fire and provide annotated computer code to facilitate implementation. Our results suggest Bayesian asymmetric regression is an essential component of a scientist’s statistical toolbox.
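The connection between the ALD and quantile regression rests on the check (pinball) loss: maximizing an ALD likelihood over its location parameter is equivalent to minimizing the summed check loss, whose minimizer is the corresponding quantile. The sketch below shows this for the intercept-only case. It is a frequentist illustration with hypothetical names and data, not the Bayesian hierarchical models or the annotated code the abstract refers to.

```python
def check_loss(residual, tau):
    """Check (pinball) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def best_constant(data, tau):
    """Constant minimizing the summed check loss.

    The objective is piecewise linear and convex with kinks at the data
    points, so searching over the observed values is sufficient.
    """
    return min(data, key=lambda c: sum(check_loss(y - c, tau) for y in data))


# Hypothetical skewed sample with one large value.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(sample, 0.5)
upper_fit = best_constant(sample, 0.9)
```

With `tau = 0.5` the fit is the median, which is unaffected by the large value; raising `tau` moves the fit toward the upper tail.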

15.
Gene expression arrays typically have 50 to 100 samples and 1000 to 20,000 variables (genes). There have been many attempts to adapt statistical models for regression and classification to these data, and in many cases these attempts have challenged the computational resources. In this article we expose a class of techniques based on quadratic regularization of linear models, including regularized (ridge) regression, logistic and multinomial regression, linear and mixture discriminant analysis, the Cox model and neural networks. For all of these models, we show that dramatic computational savings are possible over naive implementations, using standard transformations in numerical linear algebra.

16.
Sun W. Biometrics, 2012, 68(1): 1-11
RNA-seq may replace gene expression microarrays in the near future. Using RNA-seq, the expression of a gene can be estimated using the total number of sequence reads mapped to that gene, known as the total read count (TReC). Traditional expression quantitative trait locus (eQTL) mapping methods, such as linear regression, can be applied to TReC measurements after they are properly normalized. In this article, we show that eQTL mapping, by directly modeling TReC using discrete distributions, has higher statistical power than the two-step approach: data normalization followed by linear regression. In addition, RNA-seq provides information on allele-specific expression (ASE) that is not available from microarrays. By combining the information from TReC and ASE, we can computationally distinguish cis- and trans-eQTL and further improve the power of cis-eQTL mapping. Both simulation and real data studies confirm the improved power of our new methods. We also discuss the design issues of RNA-seq experiments. Specifically, we show that by combining TReC and ASE measurements, it is possible to minimize cost and retain the statistical power of cis-eQTL mapping by reducing sample size while increasing the number of sequence reads per sample. In addition to RNA-seq data, our method can also be employed to study the genetic basis of other types of sequencing data, such as chromatin immunoprecipitation followed by DNA sequencing data. In this article, we focus on eQTL mapping of a single gene using the association-based method. However, our method establishes a statistical framework for future developments of eQTL mapping methods using RNA-seq data (e.g., linkage-based eQTL mapping), and the joint study of multiple genetic markers and/or multiple genes.

17.
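Predicting forest yield using neural network and multiple regression techniques   Total citations: 16, self-citations: 0, citations by others: 16
Traditional statistical techniques are often limited by small sample sizes and by measurement data that do not follow an assumed distribution. This paper evaluates a feed-forward neural network algorithm for predicting the yield of deciduous broad-leaved forest. It also introduces a data transformation method that converts qualitative variables into quantitative ones, so that a multiple regression prediction model can be built from a relatively small sample. The data transformation method helps improve the predictive performance of the multiple regression model. Under the conditions of this experiment, the results show that the neural network technique produced the best predictions.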

18.
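Under free-field conditions, rats were presented with repetitive sound stimuli at different presentation rates (PR), and neuronal discharges were recorded with single tungsten-wire electrodes to systematically analyze how inferior colliculus neurons represent repetitive stimuli. The post-stimulus spike count (SC), a rate-based representation, decreased progressively as the stimulus was repeated, while the first spike latency (FSL), a temporal representation, gradually lengthened; both time courses were exponential. In onset neurons, the time constant of FSL was larger than that of SC, indicating slow adaptation in FSL; in sustained neurons, the time constant of FSL was smaller than that of SC, indicating slow adaptation in SC. As PR increased, the time constant of the transient process shortened, the steady-state SC decreased, and the steady-state FSL lengthened. Steady-state SC and FSL were log-linearly related to PR, with SC showing the higher degree of linearity. The adaptation of inferior colliculus neurons can enhance responsiveness to novel stimuli, making subcortical detection of abnormal information possible.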

19.
Longitudinal data usually consist of a number of short time series. A group of subjects or groups of subjects are followed over time and observations are often taken at unequally spaced time points, and may be at different times for different subjects. When the errors and random effects are Gaussian, the likelihood of these unbalanced linear mixed models can be directly calculated, and nonlinear optimization used to obtain maximum likelihood estimates of the fixed regression coefficients and parameters in the variance components. For binary longitudinal data, a two state, non-homogeneous continuous time Markov process approach is used to model serial correlation within subjects. Formulating the model as a continuous time Markov process allows the observations to be equally or unequally spaced. Fixed and time varying covariates can be included in the model, and the continuous time model allows the estimation of the odds ratio for an exposure variable based on the steady state distribution. Exact likelihoods can be calculated. The initial probability distribution on the first observation on each subject is estimated using logistic regression that can involve covariates, and this estimation is embedded in the overall estimation. These models are applied to an intervention study designed to reduce children's sun exposure.
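Why a continuous time formulation handles unequal spacing can be seen in the simplest case. For a two-state process with constant transition rates, the probability of being in state 1 after any elapsed time t has a closed form, and the steady-state odds are the ratio of the two rates. The sketch below makes the simplifying assumption of a homogeneous process (constant rates, no covariates), unlike the non-homogeneous model in the abstract; the function and parameter names are hypothetical.

```python
import math


def prob_state1(elapsed, rate_0_to_1, rate_1_to_0, start_state):
    """P(state 1 at time t | start_state at time 0) for a homogeneous two-state chain."""
    total = rate_0_to_1 + rate_1_to_0
    stationary = rate_0_to_1 / total          # long-run probability of state 1
    decay = math.exp(-total * elapsed)        # memory of the starting state
    if start_state == 1:
        return stationary + (1.0 - stationary) * decay
    return stationary * (1.0 - decay)


def steady_state_odds(rate_0_to_1, rate_1_to_0):
    """Odds of state 1 versus state 0 under the steady-state distribution."""
    return rate_0_to_1 / rate_1_to_0


# Hypothetical rates; observation gaps of any length use the same formula.
p_short_gap = prob_state1(0.1, 1.0, 3.0, start_state=0)
p_long_gap = prob_state1(50.0, 1.0, 3.0, start_state=0)
odds = steady_state_odds(1.0, 3.0)
```

Because `elapsed` can be any positive gap, the likelihood contribution of each pair of consecutive observations is available whether or not the visits are equally spaced.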

20.
Designing motor vehicle safety systems requires knowledge of whole body kinematics during dynamic loading for occupants of varying size and age, often obtained from sled tests with postmortem human subjects and human volunteers. Recently, we reported pediatric and adult responses in low-speed (<4 g) automotive-like impacts, noting reductions in maximum excursion with increasing age. Since the time-based trajectory shape is also relevant for restraint design, this study quantified the time-series trajectories using basis splines and developed a statistical model for predicting trajectories as a function of body dimension or age. Previously collected trajectories of the head, spine, and pelvis were modeled using cubic basis splines with eight control points. A principal component analysis was conducted on the control points and related to erect seated height using a linear regression model. The resulting statistical model quantified how trajectories became shorter and flatter with increasing body size, corresponding to the validation dataset. Trajectories were then predicted for erect seated heights corresponding to pediatric and adult anthropomorphic test devices (ATDs), thus generating performance criteria for the ATDs based on human response. This statistical model can be used to predict trajectories for a subject of specified anthropometry and can be utilized in subject-specific computational models of occupant response.
