Similar literature
 20 similar documents found (search time: 15 ms)
1.
Regression Smoothers for Estimating Parameters of Growth Analyses   (Total citations: 1; self-citations: 0; citations by others: 1)
The objective of regression smoothers is to obtain predicted values of a dependent variable and its first derivative from empirical data without having to assume any particular functional relationship between the dependent and independent variables. An early variant of this type of analysis, specifically natural B-splines, was first applied to growth analyses by Parsons and Hunt in 1981 (Annals of Botany 48: 341–352, 1981). The object of this paper is to describe and evaluate two recent advances in this area (cubic spline smoothers and loess smoothers) in the context of plant growth analysis and compare them to natural B-splines. The accuracies of these methods are evaluated using simulated data of a type that normally causes difficulties with other methods. A bootstrap procedure is described that improves the estimate of the optimal smoother parameter. It is shown that these smoothers can capture even subtle changes in relative growth rate. The method is then applied to growth data of Holcus lanatus. Keywords: B-splines; cubic spline smoothers; growth analyses; Holcus lanatus; loess; relative growth rate; RGR
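
As an illustration of how a smoother can return both a fitted value and a first derivative, the sketch below fits a tricube-weighted straight line around a chosen time point, which is the core step of a loess-type smoother. It is a simplified sketch, not the procedure of the paper: the function names, the `span` parameter, and the harvest data are hypothetical, and the robustness iterations and bootstrap choice of the smoothing parameter are omitted. The slope of ln(weight) against time is read off as the relative growth rate (RGR).

```python
import math

def tricube(u):
    """Tricube kernel used by loess-type smoothers; zero for |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def local_linear(times, log_weights, t0, span=0.5):
    """Weighted straight-line fit centred on t0.

    Returns (fitted ln W at t0, slope at t0). The slope d(ln W)/dt is the
    relative growth rate (RGR) estimate at t0.
    """
    n = len(times)
    k = max(2, math.ceil(span * n))            # target neighbourhood size
    dists = sorted(abs(t - t0) for t in times)
    bandwidth = dists[min(k, n - 1)]           # distance to the (k+1)-th nearest point
    w = [tricube((t - t0) / bandwidth) for t in times]
    sw = sum(w)
    mean_t = sum(wi * t for wi, t in zip(w, times)) / sw
    mean_y = sum(wi * y for wi, y in zip(w, log_weights)) / sw
    sxx = sum(wi * (t - mean_t) ** 2 for wi, t in zip(w, times))
    sxy = sum(wi * (t - mean_t) * (y - mean_y)
              for wi, t, y in zip(w, times, log_weights))
    slope = sxy / sxx
    return mean_y + slope * (t0 - mean_t), slope

# Hypothetical harvest data: dry weight growing exponentially at RGR = 0.3 per day.
days = [float(d) for d in range(11)]
log_w = [math.log(2.0 * math.exp(0.3 * d)) for d in days]
fitted, rgr = local_linear(days, log_w, t0=5.0)
```

Because the example data are exactly log-linear, the local slope should recover the assumed RGR; real data would call for a data-driven choice of `span`.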

2.
We propose a method to estimate the regression coefficients in a competing risks model where the cause-specific hazard for the cause of interest is related to covariates through a proportional hazards relationship and when cause of failure is missing for some individuals. We use multiple imputation procedures to impute missing cause of failure, where the probability that a missing cause is the cause of interest may depend on auxiliary covariates, and combine the maximum partial likelihood estimators computed from several imputed data sets into an estimator that is consistent and asymptotically normal. A consistent estimator for the asymptotic variance is also derived. Simulation results suggest the relevance of the theory in finite samples. Results are also illustrated with data from a breast cancer study.
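
The step of combining estimates from several imputed data sets can be sketched with the standard pooling rules for multiple imputation (Rubin's rules). This is an assumption for illustration: the abstract does not state that the paper's variance estimator takes this form, and the function name and numbers below are hypothetical.

```python
from statistics import mean, variance

def pool_estimates(estimates, variances):
    """Pool one coefficient across m imputed data sets using Rubin's rules.

    estimates: coefficient estimate from each imputed data set
    variances: squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total

# Hypothetical log hazard ratios from three imputed data sets.
pooled, total_var = pool_estimates([0.5, 0.6, 0.7], [0.04, 0.04, 0.04])
```

The between-imputation term is what carries the extra uncertainty due to the missing causes of failure.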

3.
A high-efficiency transfection protocol employing a common polycationic lipid is described. Using LipofectAMINE, a widely used transfection reagent, we transfected 293T cells with a plasmid harboring the β-galactosidase (β-gal) gene. The transfection efficiency was determined by direct staining for X-gal. The conventional transfection protocol achieved an efficiency of <40% while our protocol, which employs the repetition of transfection a few times, achieved an efficiency of approximately 80%. Thus, a dramatic increase in transfection efficiency can be obtained by simply repeating transfection with the use of a common polycationic lipid. This method will be useful in many molecular biological experiments.

4.
Approximate Thresholds of Interval Mapping Tests for QTL Detection   (Total citations: 2; self-citations: 3; citations by others: 2)
A. Rebai, B. Goffinet, B. Mangin. Genetics, 1994, 138(1): 235-240
A general method is proposed for calculating approximate thresholds of interval mapping tests for quantitative trait loci (QTL) detection. Simulation results show that this method, when applied to backcross and F2 populations, gives good approximations and is useful for any situation. Programs which calculate these thresholds for backcross, recombinant inbreds and F2 for any given level and any chromosome with any given distribution of codominant markers were written in Fortran 77 and are available upon request. The approach presented here could be used to obtain, after suitable calculations, thresholds for most segregating populations used in QTL mapping experiments.

5.
6.
To date, most genetic analyses of phenotypes have focused on analyzing single traits or analyzing each phenotype independently. However, joint epistasis analysis of multiple complementary traits will increase statistical power and improve our understanding of the complicated genetic structure of complex diseases. Despite their importance in uncovering the genetic structure of complex traits, statistical methods for identifying epistasis in multiple phenotypes remain fundamentally unexplored. To fill this gap, we formulate a test for interaction between two genes in multiple quantitative trait analysis as a multiple functional regression (MFRG) in which the genotype functions (genetic variant profiles) are defined as a function of the genomic position of the genetic variants. We use large-scale simulations to calculate Type I error rates for testing interaction between two genes with multiple phenotypes and to compare the power with multivariate pairwise interaction analysis and single trait interaction analysis by a single variate functional regression model. To further evaluate performance, the MFRG for epistasis analysis is applied to five phenotypes of exome sequence data from the NHLBI’s Exome Sequencing Project (ESP) to detect pleiotropic epistasis. A total of 267 pairs of genes that formed a genetic interaction network showed significant evidence of epistasis influencing five traits. The results demonstrate that the joint interaction analysis of multiple phenotypes has a much higher power to detect interaction than the interaction analysis of a single trait and may open a new direction to fully uncovering the genetic structure of multiple phenotypes.

7.
A new method for assessing the efficiency of batteries of arbitrary numbers of tests is proposed. The posterior probability of the mutagenicity of the substances studied has been estimated using discriminant analysis. The results of tests in each test system have been presented as the probability of obtaining a positive result in the given test system. This has made it possible to decrease the sample size as the number of tests in the battery increased. As a result, prognostic power may be assessed even if the matrix of results is incomplete. This approach has been used to estimate the weights of evidence for mutagenic activities of 105 chemical compounds studied by means of a battery of four tests: Ames's test, the test for chromosome aberrations in vitro, the test for cytogenetic defects in vivo, and the test for dominant lethal mutations in rodents.

8.
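A New Method for Estimating Growth Curve Parameters: The Combined Optimization-Regression Method   (Total citations: 3; self-citations: 0; citations by others: 3)
Building on the existing literature, this paper further studies the problem of estimating growth curve parameters and presents a new method, the combined optimization-regression method. The method combines optimization with regression: interval search and one-dimensional search from optimization theory yield a series of candidate values of c, and regression yields the corresponding series of values of a and b. When c takes its optimal value c^*, a and b take their optimal values a^* and b^*. Worked examples show that this parameter estimation method has high accuracy.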
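
The idea of searching over c while obtaining a and b by regression can be sketched as follows. The abstract does not give the growth curve, so the model y = a + b·exp(-c·t), the golden-section search, the search interval, and the data are all assumptions made for illustration; the function names are hypothetical.

```python
import math

def regress_given_c(t, y, c):
    """For a fixed c, fit y = a + b * exp(-c * t) by ordinary least squares.

    Returns (a, b, residual sum of squares).
    """
    x = [math.exp(-c * ti) for ti in t]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, sse

def fit_growth_curve(t, y, c_low, c_high, tol=1e-8):
    """Golden-section search over c; a and b come from regression at each c.

    Assumes the residual sum of squares has a single minimum on [c_low, c_high].
    """
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = c_low, c_high
    while hi - lo > tol:
        c1 = hi - ratio * (hi - lo)
        c2 = lo + ratio * (hi - lo)
        if regress_given_c(t, y, c1)[2] < regress_given_c(t, y, c2)[2]:
            hi = c2          # minimum lies in [lo, c2]
        else:
            lo = c1          # minimum lies in [c1, hi]
    c_star = 0.5 * (lo + hi)
    a_star, b_star, sse = regress_given_c(t, y, c_star)
    return a_star, b_star, c_star, sse

# Hypothetical noise-free data generated with a = 5, b = -3, c = 0.7.
t = [0.5 * i for i in range(21)]
y = [5.0 - 3.0 * math.exp(-0.7 * ti) for ti in t]
a_star, b_star, c_star, sse = fit_growth_curve(t, y, 0.05, 3.0)
```

Splitting the problem this way reduces a three-parameter nonlinear fit to a one-dimensional search, since the model is linear in a and b once c is fixed.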

9.
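This paper presents the basic steps of a weighted least squares method that, under a randomized block design, uses the plot means of six generations to estimate the parameters of the additive-dominance-digenic interaction model and to test that model.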

10.
The maximal linear predictable combination of a set of dependent variables is defined as that linear combination maximizing the multiple correlation coefficient with the predictor set. It allows the relative importance of a number of factors to be evaluated for the joint response, rather than for the response of each dependent variable in turn. The procedure is illustrated by an example. AMS subject classification: major 62J10, 62H20; minor 62H25.

11.
The paper deals with discrete-time regression models to analyze multistate-multiepisode failure time data. The covariate process may include fixed and external as well as internal time-dependent covariates. The effects of the covariates may differ among different kinds of failures and among successive episodes. A dynamic form of the logistic regression model is investigated and maximum likelihood estimation of the regression coefficients is discussed. In the last section we give an application of the model to the analysis of survival time after breast cancer operation.

12.
Quantitative predictions in computational life sciences are often based on regression models. The advent of machine learning has led to highly accurate regression models that have gained widespread acceptance. While there are statistical methods available to estimate the global performance of regression models on a test or training dataset, it is often not clear how well this performance transfers to other datasets or how reliable an individual prediction is, a fact that often reduces a user’s trust in a computational method. In analogy to the concept of an experimental error, we sketch how estimators for individual prediction errors can be used to provide confidence intervals for individual predictions. Two novel statistical methods, named CONFINE and CONFIVE, can estimate the reliability of an individual prediction based on the local properties of nearby training data. The methods can be applied equally to linear and non-linear regression methods with very little computational overhead. We compare our confidence estimators with other existing confidence and applicability domain estimators on two biologically relevant problems (MHC–peptide binding prediction and quantitative structure-activity relationship (QSAR)). Our results suggest that the proposed confidence estimators perform comparably to or better than previously proposed estimation methods. Given a sufficient amount of training data, the estimators exhibit error estimates of high quality. In addition, we observed that the quality of estimated confidence intervals is predictable. We discuss how confidence estimation is influenced by noise, the number of features, and the dataset size. Estimating the confidence in individual predictions in terms of error intervals represents an important step from plain, non-informative predictions towards transparent and interpretable predictions that will help to improve the acceptance of computational methods in the biological community.
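
The general idea of judging an individual prediction from the local properties of nearby training data can be sketched as below. This is only an illustration under stated assumptions, not the published CONFINE or CONFIVE algorithms: the two local scores (mean squared error of the model on the k nearest training points, and variance of their labels), the Euclidean distance, the function names, and the data are all hypothetical choices.

```python
import math

def nearest_indices(train_x, query, k):
    """Indices of the k training points closest to the query (Euclidean distance)."""
    ranked = sorted((math.dist(x, query), i) for i, x in enumerate(train_x))
    return [i for _, i in ranked[:k]]

def local_error_estimate(train_x, train_y, train_pred, query, k=3):
    """Mean squared error of the model on the k nearest training points."""
    idx = nearest_indices(train_x, query, k)
    return sum((train_y[i] - train_pred[i]) ** 2 for i in idx) / len(idx)

def local_label_variance(train_x, train_y, query, k=3):
    """Variance of the observed labels among the k nearest training points."""
    idx = nearest_indices(train_x, query, k)
    centre = sum(train_y[i] for i in idx) / len(idx)
    return sum((train_y[i] - centre) ** 2 for i in idx) / len(idx)

# Hypothetical training set: the model fits the left cluster well, the right one poorly.
train_x = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
train_y = [0.0, 1.0, 2.0, 10.0, 14.0, 9.0]
train_pred = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]

reliable = local_error_estimate(train_x, train_y, train_pred, (1.0,))
unreliable = local_error_estimate(train_x, train_y, train_pred, (11.0,))
spread = local_label_variance(train_x, train_y, (1.0,))
```

A larger local score suggests a wider error interval for the query; turning the score into a calibrated confidence interval would need a further step that is not shown here.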

13.
Screening for genetic variants that predispose individuals or their offspring to disease may be performed at the general population level or may instead be targeted at the relatives of previously identified carriers. The latter strategy has come to be known as "cascade genetic screening." Since the carrier risk of close relatives of known carriers is generally higher than the population risk, cascade screening is more efficient than population screening, in the sense that fewer individuals have to be genotyped per detected carrier. The efficacy of cascade screening, as measured by the overall proportion of carriers detected in a given population, is, however, lower than that of population-wide screening, and the respective inclusion rates vary according to the population frequency and mode of inheritance of the predisposing variants. For dominant mutations, we have developed equations that allow the inclusion rates of cascade screening to be calculated in an iterative fashion, depending upon screening depth and penetrance. For recessive mutations, we derived only equations for the screening of siblings and the children of patients. Owing to their mathematical complexity, it was necessary to study more extended screening strategies by simulation. Cascade screening turned out to result in low inclusion rates (<1%) when aimed at the identification of heterozygous carriers of rare recessive variants. Considerably higher rates are achievable, however, when screening is performed to detect covert homozygotes for frequent recessive mutations with reduced penetrance. This situation is exemplified by hereditary hemochromatosis, for which up to 40% of at-risk individuals may be identifiable through screening of first- to third-degree relatives of overt carriers (i.e., patients); the efficiency of this screening strategy was found to be approximately 50 times higher than that of population-wide screening. 
For dominant mutations, inclusion rates of cascade screening were estimated to be higher than for recessive variants. Thus, some 80% of all carriers of the factor V Leiden mutation would be detected if screening were to be targeted specifically at first- to third-degree relatives of patients with venous thrombosis. The relative cost efficiency of cascade as compared with population-wide screening (i.e., the overall savings in the extra managerial cost of the condition) is also likely to be higher for dominant than for recessive mutations. This notwithstanding, once screening has become cost-effective at the population level, it can be expected that cascade screening would only transiently represent an economically viable option.

14.
A linear correlation was found by stepwise multiple regression analysis between the sensory test of soy sauce flavor and gas chromatographic (GLC) data transformed with arcsine and logarithm. GLC data could therefore be used for objective evaluation of soy sauce flavor. A multiple correlation coefficient or a coefficient of determination of more than 0.9 was obtained at the 5th or 10th of 47 steps, respectively. The fact that the minimum standard error of the estimate was found at the 24th step suggests the importance of selecting proper peaks from the whole gas chromatogram. High estimation accuracy was obtained by applying GLC data to the calculated multiple regression model.

15.
16.
The paper deals with the random effects model, where the expectation vector and the covariance matrix of the effect influencing the population are to be estimated. The iterated estimator of the expectation vector is derived, based on the invariant estimator of the combined covariance matrix, and some of its statistical properties are shown.

17.
We propose a method for improving the quality of signal from DNA microarrays by using several scans at varying scanner sensitivities. A Bayesian latent intensity model is introduced for the analysis of such data. The method improves the accuracy with which expressions can be measured in all ranges and extends the dynamic range of measured gene expression at the high end. Our method is generic and can be applied to data from any organism, for imaging with any scanner that allows varying the laser power, and for extraction with any image analysis software. Results from a self-self hybridization data set illustrate an improved precision in the estimation of the expression of genes compared to what can be achieved by applying standard methods and using only a single scan.

18.
Penalized Multiple Regression (PMR) can be used to discover novel disease associations in GWAS datasets. In practice, proposed PMR methods have not been able to identify well-supported associations in GWAS that are undetectable by standard association tests and thus these methods are not widely applied. Here, we present a combined algorithmic and heuristic framework for PUMA (Penalized Unified Multiple-locus Association) analysis that solves the problems of previously proposed methods including computational speed, poor performance on genome-scale simulated data, and identification of too many associations for real data to be biologically plausible. The framework includes a new minorize-maximization (MM) algorithm for generalized linear models (GLM) combined with heuristic model selection and testing methods for identification of robust associations. The PUMA framework implements the penalized maximum likelihood penalties previously proposed for GWAS analysis (i.e. Lasso, Adaptive Lasso, NEG, MCP), as well as a penalty that has not been previously applied to GWAS (i.e. LOG). Using simulations that closely mirror real GWAS data, we show that our framework has high performance and reliably increases power to detect weak associations, while existing PMR methods can perform worse than single marker testing in overall performance. To demonstrate the empirical value of PUMA, we analyzed GWAS data for type 1 diabetes, Crohn's disease, and rheumatoid arthritis, three autoimmune diseases from the original Wellcome Trust Case Control Consortium.
Our analysis replicates known associations for these diseases and we discover novel etiologically relevant susceptibility loci that are invisible to standard single marker tests, including six novel associations implicating genes involved in pancreatic function, insulin pathways and immune-cell function in type 1 diabetes; three novel associations implicating genes in pro- and anti-inflammatory pathways in Crohn's disease; and one novel association implicating a gene involved in apoptosis pathways in rheumatoid arthritis. We provide software for applying our PUMA analysis framework.

19.
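Estimation of the Population Average Treatment Effect in Multidimensional Structural Regression Models under the Exchangeability Condition   (Total citations: 2; self-citations: 5; citations by others: 2)
Under the exchangeability condition, when the response variable is multidimensional, a structural regression model is used to study the estimation of the population average treatment effect.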

20.
The purpose of this study was twofold: (1) to develop multiple regression equations for predicting computed tomography (CT) derived intra-abdominal (IAF), subcutaneous (SCF), and total (TOTF = IAF + SCF) abdominal adipose tissue areas from anthropometric measures in adult white males with a large range of age (18–71 years) and percent body fat (2.0–40.6); and (2) to validate the new and existing equations that used similar Hounsfield Units (HU) for determining IAF for estimating these fat depots. One hundred fifty-one white male subjects had IAF, SCF, and TOTF determined by a single CT scan, skinfold and circumference measures taken and body density determined. Linear intra-correlations and factor analysis procedures were used to identify variables for inclusion in stepwise multiple regression solutions. IAF was estimated from age, waist circumference, the sum of mid-thigh and lower thigh circumferences, and vertical abdominal skinfold. SCF was estimated from age, umbilicus circumference, chest and suprailiac skinfolds. TOTF was estimated from age, body mass index (BMI), chest skinfold, and umbilicus circumference. R² for IAF, SCF, and TOTF was .73, .77, and .86 respectively. The existing and the new equations were validated on an independent sub-sample of 51 subjects. The only existing equation that met validation criteria had a validation R² = .67 for IAF. All three new equations met validation criteria with validation R² values of .75, .79, and .85 for IAF, SCF, and TOTF respectively. It is concluded that the new equations might be used as an inexpensive estimation of IAF, SCF, and TOTF in adult white males varying greatly in age and percent body fat.
