Similar articles
 20 similar articles found (search time: 15 ms)
1.
Kolassa JE, Tanner MA. Biometrics 1999, 55(4): 1291-1294
This article presents an algorithm for small-sample conditional confidence regions for two or more parameters for any discrete regression model in the generalized linear interactive model family. Regions are constructed by careful inversion of conditional hypothesis tests. This method presupposes the use of approximate or exact techniques for enumerating the sample space for some components of the vector of sufficient statistics conditional on other components. Such enumeration may be performed exactly or by exact or approximate Monte Carlo, including the algorithms of Kolassa and Tanner (1994, Journal of the American Statistical Association 89, 697-702; 1999, Biometrics 55, 246-251). This method also assumes that one can compute certain conditional probabilities for a fixed value of the parameter vector. Because of a property of exponential families, one can use this set of conditional probabilities to directly compute the conditional probabilities associated with any other value of the vector of the parameters of interest. This observation dramatically reduces the computational effort required to invert the hypothesis test to obtain the confidence region. To construct a region with confidence level 1 - alpha, the algorithm begins with a grid of values for the parameters of interest. For each parameter vector on the grid (corresponding to the current null hypothesis), one transforms the initial set of conditional probabilities using exponential tilting and then calculates the p value for this current null hypothesis. The confidence region is the set of parameter values for which the p value is at least alpha.
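The tilting step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single scalar parameter of interest, a conditional distribution that has already been enumerated as a dictionary, and a probability-ordered p value. The names `tilt`, `p_value`, and `confidence_set` are hypothetical.

```python
import math

def tilt(p0, theta0, theta):
    """Re-weight a conditional pmf computed at theta0 so that it applies at theta.

    p0 maps each value t of the sufficient statistic to its conditional
    probability at theta0. In an exponential family the pmf at theta is
    proportional to p0(t) * exp((theta - theta0) * t).
    """
    weights = {t: p * math.exp((theta - theta0) * t) for t, p in p0.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}

def p_value(p, t_obs):
    """Total probability of outcomes no more likely than the observed one
    (one possible test ordering; assumed here for illustration)."""
    cutoff = p[t_obs] * (1 + 1e-9)  # small tolerance for floating-point ties
    return sum(q for q in p.values() if q <= cutoff)

def confidence_set(p0, theta0, t_obs, grid, alpha=0.05):
    """Keep every grid value whose tilted null hypothesis is not rejected."""
    return [th for th in grid if p_value(tilt(p0, theta0, th), t_obs) >= alpha]

# Toy input: a Binomial(10, 1/2) pmf, i.e. log-odds theta0 = 0.
p0 = {t: math.comb(10, t) / 2**10 for t in range(11)}
grid = [i / 10 for i in range(-30, 31)]
region = confidence_set(p0, 0.0, 5, grid)
```

The point of the design is that the enumeration behind `p0` is done once; every other grid value only costs a re-weighting and a sum.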

2.
Use of runs statistics for pattern recognition in genomic DNA sequences.   Cited by: 2 (self-citations: 0; citations by others: 2)
In this article, the use of the finite Markov chain imbedding (FMCI) technique to study patterns in DNA under a hidden Markov model (HMM) is introduced. With a vision of studying multiple runs-related statistics simultaneously under an HMM through the FMCI technique, this work establishes an investigation of a bivariate runs statistic under a binary HMM for DNA pattern recognition. An FMCI-based recursive algorithm is derived and implemented for the determination of the exact distribution of this bivariate runs statistic under an independent identically distributed (IID) framework, a Markov chain (MC) framework, and a binary HMM framework. With this algorithm, we have studied the distributions of the bivariate runs statistic under different binary HMM parameter sets; probabilistic profiles of runs are created and shown to be useful for trapping HMM maximum likelihood estimates (MLEs). This MLE-trapping scheme offers good initial estimates to jump-start the expectation-maximization (EM) algorithm in HMM parameter estimation and helps prevent the EM estimates from landing on a local maximum or a saddle point. Applications of the bivariate runs statistic and the probabilistic profiles in conjunction with binary HMMs for pattern recognition in genomic DNA sequences are illustrated via case studies on DNA bendability signals using human DNA data.

3.
MOTIVATION: Gene genealogies offer a powerful context for inferences about the evolutionary process based on presently segregating DNA variation. In many cases, it is the distribution of population parameters, marginalized over the effectively infinite-dimensional tree space, that is of interest. Our evolutionary forest (EF) algorithm uses Monte Carlo methods to generate posterior distributions of population parameters. A novel feature is the updating of parameter values based on a probability measure defined on an ensemble of histories (a forest of genealogies), rather than a single tree. RESULTS: The EF algorithm generates samples from the correct marginal distribution of population parameters. Applied to actual data from closely related fruit fly species, it rapidly converged to posterior distributions that closely approximated the exact posteriors generated through massive computational effort. Applied to simulated data, it generated credible intervals that covered the actual parameter values in accordance with the nominal probabilities. AVAILABILITY: A C++ implementation of this method is freely accessible at http://www.isds.duke.edu/~scl13

4.
We have developed a versatile computer program for optimization of ligand binding experiments (e.g., radioreceptor assay system for hormones, drugs, etc.). This optimization algorithm is based on an overall measure of precision of the parameter estimates (D-optimality). The program DESIGN uses an exact mathematical model of the equilibrium ligand binding system with up to two ligands binding to any number of classes of binding sites. The program produces a minimal list of the optimal ligand concentrations for use in the binding experiment. This potentially reduces the time and cost necessary to perform a binding experiment. The program allows comparison of any proposed experimental design with the D-optimal design or with assay protocols in current use. The level of nonspecific binding is regarded as an unknown parameter of the system, along with the affinity constant (Kd) and binding capacity (Bmax). Selected parameters can be fixed at constant values and thereby excluded from the optimization algorithm. Emphasis may be placed on improving the precision of a single parameter or on improving the precision of all the parameters simultaneously. We present optimal designs for several of the more commonly used assay protocols (saturation binding with a single labeled ligand, competition or displacement curve, one or two classes of binding sites), and evaluate the robustness of these designs to changes in parameter values of the underlying models. We also derive the theoretical D-optimal design for the saturation binding experiment with a homogeneous receptor class.

5.
The dynamic decay adjustment (DDA) algorithm is a fast constructive algorithm for training RBF neural networks (RBFNs) and probabilistic neural networks (PNNs). The algorithm has two parameters, namely, $\theta^{+}$ and $\theta^{-}$. The papers which introduced DDA argued that those parameters would not heavily influence classification performance and therefore recommended always using the default values of these parameters. In contrast, this paper shows that smaller values of parameter $\theta^{-}$ can, for a considerable number of datasets, result in strong improvement in generalization performance. The experiments described here were carried out using twenty benchmark classification datasets from both Proben1 and the UCI repositories. The results show that for eleven of the datasets, the parameter $\theta^{-}$ strongly influenced classification performance. The influence of $\theta^{-}$ was also noticeable, although much less, on six of the datasets considered. This paper also compares the performance of RBF-DDA with $\theta^{-}$ selection with both AdaBoost and Support Vector Machines (SVMs).

6.
RNA sequences can form structures which are conserved throughout evolution, and the question of aligning two RNA secondary structures has been extensively studied. Most of the previous alignment algorithms require the input of gap opening and gap extension penalty parameters. The choice of appropriate parameter values is controversial as there is little biological information to guide their assignment. In this paper, we present an algorithm which circumvents this problem. Instead of finding an optimal alignment with a predefined gap opening penalty, the algorithm finds the optimal alignment with an exact number of aligned blocks.

7.
Expressions for the partition function Q(T) of DNA hairpins are presented. Calculations of Q(T), in conjunction with our previously reported numerically exact algorithm [T. M. Paner, M. Amaratunga, M. J. Doktycz, and A. S. Benight (1990) Biopolymers, 29, 1715-1734], yield a numerical method to evaluate the temperature dependence of the transition enthalpy, entropy, and free energy of a DNA hairpin directly from its optical melting curve. No prior assumptions that the short hairpins melt in a two-state manner are required. This method is then applied in a systematic manner to investigate the stability of the six basepair duplex stem 5'-GGATAC-3' having four-base dangling single-strand ends with the sequences (XY)2, where X, Y = A, T, G, C, on the 5' end and a T4 loop on the 3' end. Results show that all dangling ends of the sample set stabilize the hairpin against melting. Increases in transition temperatures as great as 4.0 °C above the blunt-ended control hairpin were observed. The hierarchy of the hairpin transition temperatures is dictated by the identity of the first base of the dangling end adjoining the duplex in the order: purine > T > C. Calculated melting curves of every hairpin were fit to experimental curves by adjustment of a single parameter in the numerically exact theoretical algorithm. Exact fits were obtained in all cases. Experimental melting curves were also calculated assuming a two-state melting process. Equally accurate fits of all dangling-ended hairpin melting curves were obtained with the two-state model calculation. This was not the case for the melting curve of the blunt-ended hairpin, indicating that the presence of a four-base dangling end drives hairpin melting to a two-state process. Q(T) was calculated as a function of temperature for each hairpin using the theoretical parameters that provided calculated curves in exact agreement with the experimentally obtained optical melting curves. From Q(T), the temperature dependence of the transition enthalpy ΔH, entropy ΔS, and free energy ΔG were calculated for every hairpin, providing a quantitative assessment of the effects of dangling ends on hairpin thermodynamics. Comparisons of our results are made with those of the Breslauer group [M. Senior, R. A. Jones, and K. J. Breslauer (1988) Biochemistry 27, 3879-3885] on the T2 5' dangling-ended d(GC)3 duplexes. (ABSTRACT TRUNCATED AT 400 WORDS)

8.
The efficiency of simulation-based multiple comparisons   Cited by: 5 (self-citations: 0; citations by others: 5)
Edwards D, Berry JJ. Biometrics 1987, 43(4): 913-928
A frequently encountered problem in practice is that of simultaneous interval estimation of p linear combinations of a parameter beta in the setting of (or equivalent to) a univariate linear model. This problem has been solved adequately only in a few settings when the covariance matrix of the estimator is diagonal; in other cases, conservative solutions can be obtained by the methods of Scheffé, Bonferroni, or Šidák (1967, Journal of the American Statistical Association 62, 626-633). Here we investigate the efficiency of using a simulated critical point for exact intervals, which has been suggested before but never put to serious test. We find the simulation-based method to be completely reliable and essentially exact. Sample size savings are substantial (in our settings): 3-19% over the Šidák method, 4-37% over the Bonferroni method, and 27-33% over the Scheffé method. We illustrate the efficiency and flexibility of the simulation-based method with case studies in physiology and marine ecology.

9.
The neurotoxicity of a substance is often tested using animal bioassays. In the functional observational battery, animals are exposed to a test agent and multiple outcomes are recorded to assess toxicity, using approximately 40 animals measured on up to 30 different items. This design gives rise to a challenging statistical problem: a large number of outcomes for a small sample of subjects. We propose an exact test for multiple binary outcomes, under the assumption that the correlation among these items is equal. This test is based upon an exponential model described by Molenberghs and Ryan (1999, Environmetrics 10, 279-300) and extends the methods developed by Corcoran et al. (2001, Biometrics 57, 941-948) who developed an exact test for exchangeably correlated binary data for groups (clusters) of correlated observations. We present a method that computes an exact p-value testing for a joint dose-response relationship. An estimate of the parameter for dose response is also determined along with its 95% confidence bound. The method is illustrated using data from a neurotoxicity bioassay for the chemical perchlorethylene.

10.
11.
The heart rate variability (HRV) spectral parameters are classically used for studying the autonomic nervous system, as they allow the evaluation of the balance between the sympathetic and parasympathetic influences on heart rhythm. However, this evaluation is usually based on fixed frequency regions, which does not allow for possible variation, or on an adaptive individual time-dependent spectral boundaries (ITSB) method, which is sensitive to noisy environments. In order to overcome these difficulties, we propose the constrained Gaussian modeling (CGM) method, which dynamically models the power spectrum as a mixture of two Gaussian shapes. It appeared that this procedure was able to accurately follow the exact parameters in the case of simulated data, in comparison with a parameter estimation obtained with a rigid frequency cutting approach or with the ITSB algorithm. Real data results obtained on a classical stand-test and on the Fantasia database are also presented and discussed.

12.
13.
We describe an efficient algorithm for determining exactly the minimum number of sires consistent with the multi-locus genotypes of a mother and her progeny. We consider cases where a simple exhaustive search through all possible sets of sires is impossible in practice because it would take too long to complete. Our algorithm for solving this combinatorial optimization problem avoids visiting large parts of search space that would not result in a solution with fewer sires. This improvement is of particular importance when the number of allelic types in the progeny array is large and when the minimum number of sires is expected to be large. Precisely in such cases, it is important to know the minimum number of sires: this number gives an exact bound on the most likely number of sires estimated by a random search algorithm in a parameter region where it may be difficult to determine whether it has converged. We apply our algorithm to data from the marine snail, Littorina saxatilis.

14.
Aitkin M. Biometrics 1999, 55(1): 117-128
This paper describes an EM algorithm for nonparametric maximum likelihood (ML) estimation in generalized linear models with variance component structure. The algorithm provides an alternative analysis to approximate MQL and PQL analyses (McGilchrist and Aisbett, 1991, Biometrical Journal 33, 131-141; Breslow and Clayton, 1993, Journal of the American Statistical Association 88, 9-25; McGilchrist, 1994, Journal of the Royal Statistical Society, Series B 56, 61-69; Goldstein, 1995, Multilevel Statistical Models) and to GEE analyses (Liang and Zeger, 1986, Biometrika 73, 13-22). The algorithm, first given by Hinde and Wood (1987, in Longitudinal Data Analysis, 110-126), is a generalization of that for random effect models for overdispersion in generalized linear models, described in Aitkin (1996, Statistics and Computing 6, 251-262). The algorithm is initially derived as a form of Gaussian quadrature assuming a normal mixing distribution, but with only slight variation it can be used for a completely unknown mixing distribution, giving a straightforward method for the fully nonparametric ML estimation of this distribution. This is of value because the ML estimates of the GLM parameters can be sensitive to the specification of a parametric form for the mixing distribution. The nonparametric analysis can be extended straightforwardly to general random parameter models, with full NPML estimation of the joint distribution of the random parameters. This can produce substantial computational saving compared with full numerical integration over a specified parametric distribution for the random parameters. A simple method is described for obtaining correct standard errors for parameter estimates when using the EM algorithm. Several examples are discussed involving simple variance component and longitudinal models, and small-area estimation.

15.
Testing of Hardy–Weinberg proportions (HWP) with asymptotic goodness-of-fit tests is problematic when the contingency table of observed genotype counts has sparse cells or the sample size is low, and exact procedures are to be preferred. Exact p-values can be (1) calculated via computationally demanding enumeration methods or (2) approximated via simulation methods. Our objective was to develop a new algorithm for exact tests of HWP with multiple alleles on the basis of conditional probabilities of genotype arrays, which is faster than existing algorithms. We derived an algorithm for calculating the exact permutation significance value without enumerating all genotype arrays having the same allele counts as the observed one. The algorithm can be used for testing HWP by (1) summation of the conditional probabilities of occurrence of genotype arrays with smaller probability than the observed one, and (2) comparison of the sum with a nominal Type I error rate α. Application to published experimental data from seven maize populations showed that the exact test is computationally feasible and reduces the number of enumerated genotype count matrices by about 30% compared with previously published algorithms.
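For orientation, the two testing steps named in this abstract can be sketched in Python for the simplest case of two alleles. This is the textbook full-enumeration exact test, not the faster multi-allele algorithm the abstract describes; the function names are hypothetical, and arrays tied with the observed probability are counted in the sum (an assumption).

```python
from fractions import Fraction
from math import factorial

def hwp_conditional_prob(n_ab, n_a, n_b):
    """Probability of n_ab heterozygotes given allele counts n_a and n_b,
    under Hardy-Weinberg proportions (two alleles A and B)."""
    n = (n_a + n_b) // 2          # number of individuals
    n_aa = (n_a - n_ab) // 2      # AA homozygotes implied by the counts
    n_bb = (n_b - n_ab) // 2      # BB homozygotes implied by the counts
    num = factorial(n) * 2**n_ab * factorial(n_a) * factorial(n_b)
    den = (factorial(n_aa) * factorial(n_ab) * factorial(n_bb)
           * factorial(2 * n))
    return Fraction(num, den)

def hwp_exact_p(n_aa, n_ab, n_bb):
    """Sum the probabilities of all genotype arrays (same allele counts)
    that are no more probable than the observed array."""
    n_a = 2 * n_aa + n_ab
    n_b = 2 * n_bb + n_ab
    # The heterozygote count must share the parity of the allele counts.
    candidates = range(n_a % 2, min(n_a, n_b) + 1, 2)
    probs = {h: hwp_conditional_prob(h, n_a, n_b) for h in candidates}
    p_obs = probs[n_ab]
    return sum(p for p in probs.values() if p <= p_obs)

# Step (2): compare the exact p-value with a nominal level alpha.
alpha = Fraction(5, 100)
reject = hwp_exact_p(10, 2, 8) < alpha
```

Exact rational arithmetic via `Fraction` avoids floating-point ties when comparing array probabilities; with more than two alleles the number of arrays grows quickly, which is the cost the abstract's algorithm is designed to reduce.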

16.
Mathematical modeling of complex gene expression programs is an emerging tool for understanding disease mechanisms. However, identification of large models sometimes requires training using qualitative, conflicting or even contradictory data sets. One strategy to address this challenge is to estimate experimentally constrained model ensembles using multiobjective optimization. In this study, we used Pareto Optimal Ensemble Techniques (POETs) to identify a family of proof-of-concept signal transduction models. POETs integrate Simulated Annealing (SA) with Pareto optimality to identify models near the optimal tradeoff surface between competing training objectives. We modeled a prototypical signaling network using mass-action kinetics within an ordinary differential equation (ODE) framework (64 ODEs in total). The true model was used to generate synthetic immunoblots from which the POET algorithm identified the 117 unknown model parameters. POET generated an ensemble of signaling models, which collectively exhibited population-like behavior. For example, scaled gene expression levels were approximately normally distributed over the ensemble following the addition of extracellular ligand. Also, the ensemble recovered robust and fragile features of the true model, despite significant parameter uncertainty. Taken together, these results suggest that experimentally constrained model ensembles could capture qualitatively important network features without exact parameter information.
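The notion of Pareto optimality used in this abstract can be made concrete with a small Python sketch. It shows only the dominance check and the resulting non-dominated set for objectives that are all minimized; it is not the POETs code, it omits the simulated annealing step entirely, and the names `dominates` and `pareto_front` are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a is at least as good as b everywhere
    and strictly better somewhere (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Each tuple holds the training errors of one candidate model
# against two competing data sets (made-up numbers).
errors = [(1, 4), (2, 2), (4, 1), (3, 3), (2, 5)]
front = pareto_front(errors)
```

An ensemble in the spirit of the abstract would keep candidates on or near this front rather than a single best-fit parameter set.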

17.
Currently, linear mixed model analyses of expression microarray experiments are performed either in a gene-specific or global mode. The joint analysis provides more flexibility in terms of how parameters are fitted and estimated and tends to be more powerful than the gene-specific analysis. Here we show how to implement the gene-specific linear mixed model analysis as an exact algorithm for the joint linear mixed model analysis. The gene-specific algorithm is exact, when the mixed model equations can be partitioned into unrelated components: One for all global fixed and random effects and the others for the gene-specific fixed and random effects for each gene separately. This unrelatedness holds under three conditions: (1) any gene must have the same number of replicates or probes on all arrays, but these numbers can differ among genes; (2) the residual variance of the (transformed) expression data must be homogeneous or constant across genes (other variance components need not be homogeneous) and (3) the number of genes in the experiment is large. When these conditions are violated, the gene-specific algorithm is expected to be nearly exact.

18.
Assuming a lognormally distributed measure of bioavailability, individual bioequivalence is defined as originally proposed by Anderson and Hauck (1990) and Wellek (1990; 1993). For the posterior probability of the associated statistical hypothesis with respect to a noninformative reference prior, a numerically efficient algorithm is constructed which serves as the building block of a procedure for computing exact rejection probabilities of the Bayesian test under arbitrary parameter constellations. By means of this tool, the Bayesian test can be shown to maintain the significance level without being over‐conservative and to yield gains in power of up to 30% as compared to the distribution‐free procedure which gained some popularity under the name TIER. Moreover, it is shown that the Bayesian construction also allows scaling of the probability‐based criterion with respect to the proportion of subjects exhibiting bioequivalent responses to repeated administrations of the reference formulation of the drug under study.

19.
Consider a general linear model with $p$-dimensional parameter vector $\beta$ and i.i.d. normal errors. Let $K_1, \ldots, K_k$, and $L$ be linearly independent vectors of constants such that $L^{T}\beta \neq 0$. We describe exact simultaneous tests for hypotheses that $K_i^{T}\beta / L^{T}\beta$ equal specified constants using one-sided and two-sided alternatives, and describe exact simultaneous confidence intervals for these ratios. In the case where the confidence set is a single bounded contiguous set, we describe what we claim are the best possible conservative simultaneous confidence intervals for these ratios - best in that they form the minimum $k$-dimensional hypercube enclosing the exact simultaneous confidence set. We show that in the case of $k = 2$, this "box" is defined by the minimum and maximum values for the two ratios in the simultaneous confidence set and that these values are obtained via one of two sources: either from the solutions to each of four systems of equations or at points along the boundary of the simultaneous confidence set where the correlation between two t variables is zero. We then verify that these intervals are narrower than those previously presented in the literature.

20.
We investigate 2d Ising spin glasses with binary couplings via exact computations of the partition function on lattices with periodic boundary conditions. After introducing the physical issues, we sketch the algorithm to compute the partition function as a polynomial with integer coefficients. This technique is then exploited to obtain the thermodynamic properties of the spin glass. We find an anomalous low temperature scaling of the heat capacity $c_v \sim e^{-2\beta}$ and that hyperscaling holds.
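To see why the partition function is a polynomial with integer coefficients when the couplings are binary, a brute-force Python sketch is enough. It enumerates every spin configuration of a tiny periodic lattice and counts how many have each (integer) energy, so that Z(β) = Σ_E g(E) e^{-βE}. This is an illustration only: it scales as 2^(L·L) and is not the algorithm the abstract refers to. The function names and the ±1 coupling arrays `J_right` and `J_down` are assumptions.

```python
import math
from collections import Counter
from itertools import product

def density_of_states(L, J_right, J_down):
    """Count configurations per energy on an L x L periodic lattice.

    J_right[r][c] couples site (r, c) to its right neighbour and
    J_down[r][c] couples it to the neighbour below; entries are +1 or -1.
    Use L >= 3 so that periodic wrapping does not duplicate bonds.
    """
    counts = Counter()
    for spins in product((-1, 1), repeat=L * L):
        energy = 0
        for r in range(L):
            for c in range(L):
                s = spins[r * L + c]
                energy -= J_right[r][c] * s * spins[r * L + (c + 1) % L]
                energy -= J_down[r][c] * s * spins[((r + 1) % L) * L + c]
        counts[energy] += 1
    return counts

def partition_function(counts, beta):
    """Evaluate Z(beta) from the integer coefficients g(E)."""
    return sum(g * math.exp(-beta * e) for e, g in counts.items())

# Toy check on a 3 x 3 ferromagnet (all couplings +1).
L = 3
ferro = [[1] * L for _ in range(L)]
g = density_of_states(L, ferro, ferro)
```

Once the integer counts g(E) are known, thermodynamic quantities at any temperature follow from `partition_function` without re-enumerating the lattice.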
