13 similar records retrieved (search time: 6 ms).
1.
Gene co-expressions have been widely used in the analysis of microarray gene expression data. However, the co-expression pattern between two genes can be mediated by cellular states, as reflected by the expression of other genes, single nucleotide polymorphisms, and the activity of protein kinases. In this article, we introduce a bivariate conditional normal model for identifying the variables that mediate the co-expression pattern between two genes. Based on this model, we introduce a likelihood ratio (LR) test and a penalized likelihood procedure for identifying the mediators that affect gene co-expression patterns. We propose an efficient computational algorithm based on iteratively reweighted least squares and cyclic coordinate descent, and show that when the tuning parameter in the penalized likelihood is appropriately selected, the procedure has the oracle property for variable selection. We present simulation results comparing the approaches with existing methods and show that the LR-based approach performs similarly to, or better than, the existing method of liquid association, while the penalized likelihood procedure can be quite effective in selecting the mediators. We apply the proposed method to yeast gene expression data in order to identify the kinases and single nucleotide polymorphisms that mediate the co-expression patterns between genes.
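As a rough illustration of the kind of test described above, the sketch below fits a bivariate normal model for two standardised expression profiles whose correlation depends on a candidate mediator through a tanh (Fisher-z) link, and compares it with a constant-correlation fit via a likelihood ratio statistic. The link function, the simulated data, and the use of scipy optimisation are illustrative assumptions, not the article's exact formulation.

```python
import numpy as np
from scipy import optimize, stats

def neg_loglik(params, x, y, z, with_mediator=True):
    """Negative log-likelihood of a standard bivariate normal for (x, y) whose
    correlation depends on a candidate mediator z through a tanh link."""
    if with_mediator:
        b0, b1 = params
        rho = np.tanh(b0 + b1 * z)
    else:
        (b0,) = params
        rho = np.tanh(b0) * np.ones_like(z)
    rho = np.clip(rho, -0.999, 0.999)          # numerical safeguard
    q = (x**2 - 2 * rho * x * y + y**2) / (1 - rho**2)
    return np.sum(0.5 * q + 0.5 * np.log(1 - rho**2) + np.log(2 * np.pi))

def lr_test(x, y, z):
    """LR statistic comparing mediator-dependent vs constant correlation."""
    fit1 = optimize.minimize(neg_loglik, x0=[0.0, 0.0], args=(x, y, z, True))
    fit0 = optimize.minimize(neg_loglik, x0=[0.0], args=(x, y, z, False))
    lr = 2 * (fit0.fun - fit1.fun)
    return lr, stats.chi2.sf(lr, df=1)

# Hypothetical data: the correlation between two genes is modulated by z
rng = np.random.default_rng(0)
n = 500
z = rng.standard_normal(n)                     # candidate mediator (e.g. kinase activity)
rho = np.tanh(0.8 * z)
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)
print(lr_test(x, y, z))
```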
2.
Assessment of the misclassification error rate is of high practical relevance in many biomedical applications. As it is a complex problem, theoretical results on estimator performance are few. Most findings originate from Monte Carlo simulations, which take place in the "normal setting": the covariables of the two groups follow a multivariate normal distribution, the groups differ in location but share the same covariance matrix, and the linear discriminant function (LDF) is used for prediction. We perform a new simulation to compare existing nonparametric estimators in a more complex situation. The underlying distribution is based on a logistic model with six binary as well as continuous covariables. To study estimator performance for varying true error rates, three prediction rules, including nonparametric classification trees and parametric logistic regression, and sample sizes ranging from 100 to 1,000 are considered. In contrast to most published papers we turn our attention to estimator performance based on simple, even inappropriate prediction rules and relatively large training sets. For the most part, results are in agreement with usual findings. The most striking behavior was seen when applying (simple) classification trees for prediction: since the apparent error rate Êrr.app is biased, linear combinations incorporating Êrr.app underestimate the true error rate even for large sample sizes. The .632+ estimator, which was designed to correct for the overoptimism of Efron's .632 estimator for nonparametric prediction rules, performs best of all such linear combinations. The bootstrap estimator Êrr.B0 and the cross-validation estimator Êrr.cv, which do not depend on Êrr.app, seem to track the true error rate. Although the disadvantages of both estimators – pessimism of Êrr.B0 and high variability of Êrr.cv – shrink with increased sample sizes, they are still visible. We conclude that for the choice of a particular estimator the asymptotic behavior of the apparent error rate is important. For the assessment of estimator performance the variance of the true error rate is crucial, and, in general, the stability of prediction procedures is essential for the application of estimators based on resampling methods. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)
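For readers unfamiliar with the estimators being compared, the sketch below computes the apparent error, the leave-one-out bootstrap error Êrr.B0, and the .632/.632+ linear combinations for an arbitrary scikit-learn prediction rule, following the standard .632+ construction (up to minor safeguards in the original definition). The classifier, data, and number of bootstrap replicates are hypothetical placeholders rather than the paper's simulation design.

```python
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier

def error_estimates(model, X, y, n_boot=200, seed=0):
    """Apparent error, leave-one-out bootstrap error (Err.B0), and the
    .632 / .632+ combinations for a given prediction rule (X, y: numpy arrays)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    classes = np.unique(y)

    fitted = clone(model).fit(X, y)
    err_app = np.mean(fitted.predict(X) != y)          # apparent (resubstitution) error

    # leave-one-out bootstrap: each point is judged only by bootstrap fits
    # whose resample did not contain it
    errs = np.full((n_boot, n), np.nan)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)
        out = np.setdiff1d(np.arange(n), idx)
        if out.size:
            m = clone(model).fit(X[idx], y[idx])
            errs[b, out] = (m.predict(X[out]) != y[out]).astype(float)
    err_b0 = np.nanmean(np.nanmean(errs, axis=0))

    # no-information error rate and relative overfitting rate
    p = np.mean(y[:, None] == classes[None, :], axis=0)                   # class proportions
    q = np.mean(fitted.predict(X)[:, None] == classes[None, :], axis=0)   # prediction proportions
    gamma = np.sum(p * (1 - q))
    R = (err_b0 - err_app) / (gamma - err_app) if (err_b0 > err_app and gamma > err_app) else 0.0
    w = 0.632 / (1 - 0.368 * R)

    return {"apparent": err_app,
            "B0": err_b0,
            ".632": 0.368 * err_app + 0.632 * err_b0,
            ".632+": (1 - w) * err_app + w * err_b0}

# Hypothetical usage: an overfitting tree has apparent error 0, which the
# .632+ weighting is designed to compensate for.
rng = np.random.default_rng(1)
X = rng.standard_normal((300, 6))
y = (X[:, 0] + X[:, 1] + 0.5 * rng.standard_normal(300) > 0).astype(int)
print(error_estimates(DecisionTreeClassifier(max_depth=None), X, y))
```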
3.
It has become increasingly common in epidemiological studies to pool specimens across subjects to achieve accurate quantitation of biomarkers and certain environmental chemicals. In this article, we consider the problem of fitting a binary regression model when an important exposure is subject to pooling. We take a regression calibration approach and derive several methods, including plug-in methods that use a pooled measurement and other covariate information to predict the exposure level of an individual subject, and normality-based methods that make further adjustments by assuming normality of calibration errors. Within each class we propose two ways to perform the calibration (covariate augmentation and imputation). These methods are shown in simulation experiments to effectively reduce the bias associated with the naive method that simply substitutes a pooled measurement for all individual measurements in the pool. In particular, the normality-based imputation method performs reasonably well in a variety of settings, even under skewed distributions of calibration errors. The methods are illustrated using data from the Collaborative Perinatal Project.
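The plug-in idea can be sketched as follows under a simple linear exposure model: estimate the exposure-covariate regression from pool-level averages, predict each subject's exposure from the pooled measurement plus that subject's covariate deviation, and fit the logistic model with the calibrated values. The data, pool size, and calibration model below are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np
import statsmodels.api as sm

# Minimal sketch of a plug-in regression-calibration idea for pooled exposures
# (hypothetical data; a simplified version of the approach, not the paper's estimator).
rng = np.random.default_rng(0)
n_pools, k = 400, 2                                        # pools of size k

z = rng.standard_normal((n_pools, k))                      # individual covariate
x = 1.0 + 0.6 * z + rng.standard_normal((n_pools, k))      # true individual exposure (unobserved)
y = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + 0.8 * x))))   # individual binary outcomes
x_pool = x.mean(axis=1)                                    # only the pooled measurement is available

# Step 1: estimate the exposure model X = a0 + a1*Z + e from pool-level averages,
# which is valid because pooling averages both sides of the linear model.
cal = sm.OLS(x_pool, sm.add_constant(z.mean(axis=1))).fit()
a1 = cal.params[1]

# Step 2: plug-in prediction of each individual exposure from the pooled value
# plus that subject's deviation in the covariate.
x_hat = x_pool[:, None] + a1 * (z - z.mean(axis=1, keepdims=True))

# Step 3: fit the binary regression with the calibrated exposure.
X_design = sm.add_constant(np.column_stack([x_hat.ravel(), z.ravel()]))
fit = sm.Logit(y.ravel(), X_design).fit(disp=0)
print(fit.params)   # the exposure slope should be noticeably less attenuated than a naive
                    # fit that reuses the pooled value for every pool member
```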
4.
The question of whether selection experiments ought to include a control line, as opposed to investing all facilities in a single selected line, is addressed from a likelihood perspective. The consequences of using a control line are evaluated under two scenarios. In the first one, environmental trend is modeled and inferred from the data. In this case, a control line is shown to be highly beneficial in terms of the efficiency of inferences about heritability and response to selection. In the second scenario, environmental trend is not modeled. One can imagine that a previous analysis of the experimental data had lent support to this decision. It is shown that, even in this situation where a control line may seem superfluous, including one can still result in minor gains in efficiency if a high selection intensity is practiced in the selected line. Further, if there is a loss, it is moderately small. The results are verified to hold under more complicated data structures via Monte Carlo simulation. For completeness, divergent selection designs are also reviewed, and inferences based on a conditional and full likelihood approach are contrasted.
5.
Occupational, environmental, and nutritional epidemiologists are often interested in estimating the prospective effect of time-varying exposure variables such as cumulative exposure or cumulative updated average exposure, in relation to chronic disease endpoints such as cancer incidence and mortality. From exposure validation studies, it is apparent that many of the variables of interest are measured with moderate to substantial error. Although the ordinary regression calibration (ORC) approach is approximately valid and efficient for measurement error correction of relative risk estimates from the Cox model with time-independent point exposures when the disease is rare, it is not adaptable for use with time-varying exposures. By recalibrating the measurement error model within each risk set, a risk set regression calibration (RRC) method is proposed for this setting. An algorithm for a bias-corrected point estimate of the relative risk using an RRC approach is presented, followed by the derivation of an estimate of its variance, resulting in a sandwich estimator. Emphasis is on methods applicable to the main study/external validation study design, which arises in important applications. Simulation studies under several assumptions about the error model were carried out, which demonstrated the validity and efficiency of the method in finite samples. The method was applied to a study of diet and cancer from Harvard's Health Professionals Follow-up Study (HPFS).
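A stylised sketch of the risk-set recalibration idea is given below for a single mismeasured exposure and a one-parameter Cox model: at each event time the calibration regression is re-estimated from validation subjects still at risk, and the calibrated exposures enter the partial-likelihood score. The baseline (non-time-varying) exposure, the simulated external validation sample, and the simple linear calibration are simplifying assumptions for illustration only, not the algorithm derived in the paper.

```python
import numpy as np
from scipy import optimize

rng = np.random.default_rng(0)

n_main, n_val = 1000, 300
x_main = rng.standard_normal(n_main)                 # true exposure (never observed)
w_main = x_main + 0.8 * rng.standard_normal(n_main)  # mismeasured surrogate
t_main = rng.exponential(1 / np.exp(0.5 * x_main))   # event times under beta = 0.5
c_main = rng.exponential(2.0, n_main)
time, event = np.minimum(t_main, c_main), t_main <= c_main

x_val = rng.standard_normal(n_val)                   # external validation study: x and w both seen
w_val = x_val + 0.8 * rng.standard_normal(n_val)
t_val = np.minimum(rng.exponential(1 / np.exp(0.5 * x_val)), rng.exponential(2.0, n_val))

def rrc_score(beta):
    """Cox partial-likelihood score with the exposure recalibrated within each risk set."""
    score = 0.0
    for j in np.flatnonzero(event):
        ti = time[j]
        risk = time >= ti                            # main-study risk set at ti
        val_risk = t_val >= ti                       # validation subjects still at risk
        if val_risk.sum() < 10:                      # fall back to the whole validation study
            val_risk = np.ones(n_val, dtype=bool)
        a1, a0 = np.polyfit(w_val[val_risk], x_val[val_risk], 1)
        x_hat = a0 + a1 * w_main[risk]               # calibrated exposures in this risk set
        weights = np.exp(beta * x_hat)
        score += (a0 + a1 * w_main[j]) - np.sum(x_hat * weights) / np.sum(weights)
    return score

beta_hat = optimize.brentq(rrc_score, -2.0, 2.0)
print(beta_hat)   # should sit near the true 0.5, with much less attenuation than using w directly
```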
6.
We investigate the effects of measurement error on the estimation of nonparametric variance functions. We show that either ignoring measurement error or direct application of the simulation extrapolation (SIMEX) method leads to inconsistent estimators. Nevertheless, the direct SIMEX method can reduce bias relative to a naive estimator. We further propose a permutation SIMEX method that leads to consistent estimators in theory. The performance of both SIMEX methods depends on approximations to the exact extrapolants. Simulations show that both SIMEX methods perform better than ignoring measurement error. The methodology is illustrated using microarray data from colon cancer patients.
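The generic SIMEX recipe (add extra error at increasing variance multipliers λ, re-estimate, then extrapolate back to λ = -1) can be illustrated on a simple attenuated slope; the paper applies the same idea, plus a permutation variant, to nonparametric variance-function estimation. Everything below (data, a known error variance, the quadratic extrapolant) is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma_u = 2000, 0.6                      # sigma_u assumed known, as SIMEX requires
x = rng.standard_normal(n)
w = x + sigma_u * rng.standard_normal(n)    # observed error-prone covariate
y = 1.0 + 2.0 * x + 0.3 * rng.standard_normal(n)

def naive_slope(w, y):
    return np.polyfit(w, y, 1)[0]

lambdas = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
B = 100
sim_est = []
for lam in lambdas:
    # re-estimate after adding extra noise with variance lam * sigma_u**2
    reps = [naive_slope(w + np.sqrt(lam) * sigma_u * rng.standard_normal(n), y)
            for _ in range(B)]
    sim_est.append(np.mean(reps))

# extrapolate the mean estimates back to lambda = -1 (no measurement error)
coef = np.polyfit(lambdas, sim_est, 2)
simex_slope = np.polyval(coef, -1.0)
print(naive_slope(w, y), simex_slope)       # naive ~ 2/(1 + 0.36) ~ 1.47; SIMEX is closer to 2
```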
7.
Jay Odenbaugh. Biology & Philosophy, 2005, 20(2-3): 231-255.
Ecologists attempt to understand the diversity of life with mathematical models. Often, mathematical models contain simplifying idealizations designed to cope with the blooming, buzzing confusion of the natural world. This strategy frequently issues in models whose predictions are inaccurate. Critics of theoretical ecology argue that only predictively accurate models are successful and contribute to the applied work of conservation biologists. Hence, they think that much of the mathematical work of ecologists is poor science. Against this view, I argue that model building is successful even when models are predictively inaccurate for at least three reasons: models allow scientists to explore the possible behaviors of ecological systems; models give scientists simplified means by which they can investigate more complex systems by determining how the more complex system deviates from the simpler model; and models give scientists conceptual frameworks through which they can conduct experiments and fieldwork. Critics often mistake the purposes of model building, and once we recognize this, we can see their complaints are unjustified. Even though models in ecology are not always accurate in their assumptions and predictions, they still contribute to successful science.
8.
Human error analysis is certainly a challenge today for all involved in safety and environmental risk assessment. The risk assessment process should not ignore the role of humans in accidental events and the consequences that may derive from human error. This article presents a case study of the Success Likelihood Index Method (SLIM) applied to the Electric Power Company of Serbia (EPCS), with the aim of demonstrating the importance of human error analysis in risk assessment. A database of work-related injuries, accidents, and critical interventions that occurred over a 10-year period in the EPCS provided the basis for this study. The research comprised an analysis of 1074 workplaces with a total of 3997 employees. A detailed analysis identified 10 typical human errors, their performance shaping factors (PSFs), and the estimated human error probability (HEP). Based on the results, one can conclude that PSF control remains crucial for human error reduction, and thus for the prevention of occupational injuries and fatalities (the number of injuries decreased from 58 in 2012 to 44 in 2013, with no fatalities recorded). Furthermore, the case study performed at the EPCS confirmed that SLIM is highly applicable for the quantification of human error, comprehensive, and easy to perform.
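For context, a minimal SLIM-style calculation is sketched below: a Success Likelihood Index (SLI) is formed as a weighted sum of PSF ratings and converted to a human error probability through the usual log-linear calibration anchored on tasks with known HEPs. The weights, ratings, and anchor values are invented for illustration and are not taken from the EPCS study.

```python
import numpy as np

# PSF weights (summing to 1) and task ratings on a 1-9 scale; all values are illustrative.
psf_weights = np.array([0.30, 0.25, 0.20, 0.15, 0.10])   # e.g. stress, training, procedures, ...
ratings = {
    "switching error":        [3, 5, 4, 6, 5],
    "isolation not verified": [2, 4, 3, 5, 4],
}
sli = {task: float(psf_weights @ np.array(r)) for task, r in ratings.items()}

# Calibration of log10(HEP) = a*SLI + b from two anchor tasks with known HEPs.
sli_anchor = np.array([2.0, 8.0])
hep_anchor = np.array([1e-1, 1e-4])
a, b = np.polyfit(sli_anchor, np.log10(hep_anchor), 1)

hep = {task: 10 ** (a * s + b) for task, s in sli.items()}
print(hep)   # estimated human error probability per task
```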
9.
10.
Synonymous codon usage in related species may differ as a result of variation in mutation biases, differences in the overall strength and efficiency of selection, and shifts in codon preference—the selective hierarchy of codons within and between amino acids. We have developed a maximum-likelihood method to employ explicit population genetic models to analyze the evolution of parameters determining codon usage. The method is applied to twofold degenerate amino acids in 50 orthologous genes from D. melanogaster and D. virilis. We find that D. virilis has significantly reduced selection on codon usage for all amino acids, but the data are incompatible with a simple model in which there is a single difference in the long-term N_e, or overall strength of selection, between the two species, indicating shifts in codon preference. The strength of selection acting on codon usage in D. melanogaster is estimated to be |N_e s| ≈ 0.4 for most CT-ending twofold degenerate amino acids, but 1.7 times greater for cysteine and 1.4 times greater for AG-ending codons. In D. virilis, the strength of selection acting on codon usage for most amino acids is only half that acting in D. melanogaster but is considerably greater than half for cysteine, perhaps indicating the dual selection pressures of translational efficiency and accuracy. Selection coefficients in orthologues are highly correlated (ρ = 0.46), but a number of genes deviate significantly from this relationship.
Received: 20 December 1998 / Accepted: 17 February 1999
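A toy version of this kind of likelihood calculation is sketched below: the scaled selection coefficient S on codon usage for a twofold degenerate amino acid is estimated by maximum likelihood from preferred/unpreferred codon counts under the standard mutation-selection-drift equilibrium P(preferred) = 1/(1 + k·e^(-S)). The counts and the mutational bias k are illustrative, and the paper's full model is considerably richer.

```python
import numpy as np
from scipy import optimize, stats

def neg_loglik(S, n_pref, n_unpref, k=1.0):
    """Binomial negative log-likelihood of the preferred-codon count under
    equilibrium frequency P = 1 / (1 + k * exp(-S))."""
    p = 1.0 / (1.0 + k * np.exp(-S))
    return -stats.binom.logpmf(n_pref, n_pref + n_unpref, p)

n_pref, n_unpref = 310, 190          # preferred vs unpreferred codon counts in a gene (illustrative)
res = optimize.minimize_scalar(neg_loglik, bounds=(-5, 5),
                               args=(n_pref, n_unpref), method="bounded")
print(res.x)                          # MLE of S; roughly log(k * n_pref / n_unpref) here
```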
11.
N-methyl-D-aspartate (NMDA) glutamate receptors play crucial roles in neuronal synaptic plasticity, learning, and memory. However, whether different NMDA subunits are implicated in specific forms of memory is unclear. Moreover, nothing is known about the interspecific genetic variability of the GRIN2A subunit and how this variation can potentially explain evolutionary changes in behavioral phenotypes. Here, we used 28 primate GRIN2A sequences and various proxies of memory across primates to investigate the role of GRIN2A. Codon-specific analysis of these sequences showed that GRIN2A in primates coevolved with a likely ecological proxy of spatial memory (relative home-range size) but not with other indices of non-spatial learning and memory, such as social memory and social learning. Models based on gene averages failed to detect positive selection in primate branches with major changes in relative home-range size. This implies that accelerated evolution is concentrated in specific parts of the protein expressed by GRIN2A. Overall, our molecular evolution study, the first on GRIN2A, supports the notion that different NMDA subunits may play a role in specific forms of memory and that phenotypic diversity, along with genetic evolution, can be used to investigate the link between genes and behavior across evolutionary time.
12.
Influences of error distributions of net ecosystem exchange on parameter estimation of a process-based terrestrial model: A case of broad-leaved Korean pine mixed forest in Changbaishan, China
Model predictions can be improved by parameter estimation from measurements. Measurement errors of net ecosystem exchange (NEE) of CO2 have commonly been assumed to follow a normal distribution. However, recent studies have shown that errors in eddy covariance measurements more closely follow a double exponential distribution. In this paper, we compared the effects of different distributions of measurement errors of NEE data on parameter estimation. NEE measurements in the Changbaishan forest were assimilated into a process-based terrestrial ecosystem model. We used the Markov chain Monte Carlo method to derive probability density functions of the estimated parameters. Our results showed that modeled annual total gross primary production (GPP) and ecosystem respiration (Re) using the normal error distribution were higher than those using the double exponential distribution by 61–86 gC m⁻² a⁻¹ and 107–116 gC m⁻² a⁻¹, respectively. As a result, the modeled annual sum of NEE using the normal error distribution was lower by 29–47 gC m⁻² a⁻¹ than that using the double exponential error distribution. In particular, modeled daily NEE based on the normal distribution underestimated the strong carbon sink in the Changbaishan forest during the growing season. We conclude that the type of measurement error distribution, and the corresponding cost function, can substantially influence the estimation of parameters and carbon fluxes.
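The contrast between the two error assumptions comes down to the cost function used in the assimilation: a sum of squared residuals for Gaussian errors versus a sum of absolute residuals for double-exponential (Laplace) errors. The sketch below writes out both costs and a minimal Metropolis step of the kind used in MCMC parameter estimation; the stand-in model function, prior bounds, and step size are placeholders, not the ecosystem model of the study.

```python
import numpy as np

def cost_gaussian(obs, mod, sigma):
    """Negative log-likelihood (up to a constant) for normally distributed errors."""
    return np.sum((obs - mod) ** 2 / (2.0 * sigma ** 2))

def cost_laplace(obs, mod, b):
    """Negative log-likelihood (up to a constant) for double-exponential errors."""
    return np.sum(np.abs(obs - mod) / b)

def metropolis(obs, model_nee, cost, n_iter=5000, lo=0.0, hi=1.0, step=0.02, scale=1.0,
               rng=np.random.default_rng(0)):
    """Minimal Metropolis sampler for one model parameter with a uniform prior on [lo, hi]."""
    theta = 0.5 * (lo + hi)
    c_old = cost(obs, model_nee(theta), scale)
    chain = []
    for _ in range(n_iter):
        prop = theta + step * rng.standard_normal()
        if lo <= prop <= hi:
            c_new = cost(obs, model_nee(prop), scale)
            if np.log(rng.random()) < c_old - c_new:   # accept with probability exp(-(c_new - c_old))
                theta, c_old = prop, c_new
        chain.append(theta)
    return np.array(chain)

# Tiny demonstration with a stand-in "model": NEE proportional to one parameter.
rng = np.random.default_rng(1)
drivers = np.linspace(0.0, 1.0, 200)
def model_nee(theta):
    return -10.0 * theta * drivers
obs = model_nee(0.6) + rng.laplace(0.0, 1.0, drivers.size)   # double-exponential observation errors

chain_gauss = metropolis(obs, model_nee, cost_gaussian, scale=np.sqrt(2.0))
chain_laplace = metropolis(obs, model_nee, cost_laplace, scale=1.0)
print(chain_gauss[2000:].mean(), chain_laplace[2000:].mean())
```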
13.
Model-free model elimination: A new step in the model-free dynamic analysis of NMR relaxation data
Model-free analysis is a technique commonly used within the field of NMR spectroscopy to extract atomic-resolution, interpretable dynamic information on multiple timescales from the R1, R2, and steady-state NOE. Model-free approaches employ two disparate areas of data analysis: the discipline of mathematical optimisation, specifically the minimisation of a χ² function, and the statistical field of model selection. By searching through a large number of model-free minimisations, set up using synthetic relaxation data for which the true underlying dynamics is known, certain model-free models have been identified to fail at times. This failure is characterised by either the internal correlation times (τe, τf, or τs) or the global correlation time parameter, local τm, heading towards infinity, the result being that the final parameter values are far from the true values. In a number of cases the minimised χ² value of the failed model is significantly lower than that of all other models and, hence, it will be the model chosen by model selection techniques. If these models are not removed prior to model selection, the final model-free results could be far from the truth. By implementing a series of empirical rules involving inequalities, these models can be specifically isolated and removed. Model-free analysis should therefore consist of three distinct steps: model-free minimisation, model-free model elimination, and finally model-free model selection. Failure has also been identified to affect the individual Monte Carlo simulations used within error analysis. Each simulation involves an independent randomised relaxation data set and a model-free minimisation, so simulations suffer from exactly the same types of failure as model-free models. Therefore, to prevent these outliers from causing a significant overestimation of the errors, the failed Monte Carlo simulations need to be culled prior to calculating the parameter standard deviations.
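The elimination step can be pictured as a simple set of parameter checks applied after minimisation and before model selection (and likewise to each Monte Carlo simulation). The cut-offs below, an internal correlation time exceeding twice the tumbling time and a local τm above an absolute bound, are illustrative values of the kind of inequality rules described, not figures quoted from the paper.

```python
# Sketch of the elimination step: after model-free minimisation, discard any model
# (or Monte Carlo simulation) whose optimised parameters have run away towards infinity.
# All cut-off values and parameter dictionaries below are illustrative assumptions.
def eliminate(params, tm_global=None):
    """Return True if a fitted model-free model should be eliminated."""
    tm = params.get("local_tm", tm_global)           # global or local tumbling time (s)
    if tm is not None and tm >= 200e-9:              # runaway local tau_m
        return True
    for name in ("te", "tf", "ts"):                  # runaway internal correlation times
        t = params.get(name)
        if t is not None and tm is not None and t >= 2.0 * tm:
            return True
    return False

fits = [
    {"model": "m2", "te": 50e-12, "chi2": 42.1},
    {"model": "m5", "tf": 80e-12, "ts": 9e-6, "chi2": 12.3},   # ts has headed towards "infinity"
]
survivors = [f for f in fits if not eliminate(f, tm_global=8e-9)]
print([f["model"] for f in survivors])               # only m2 survives, despite m5's lower chi2
```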