Similar Documents
 20 similar documents retrieved.
1.
The problem of determining an optimal phylogenetic tree from a set of data is an example of the Steiner problem in graphs. There is no efficient algorithm for solving this problem with reasonably large data sets. In the present paper an approach is described that proves in some cases that a given tree is optimal without testing all possible trees. The method first uses a previously described heuristic algorithm to find a tree of relatively small total length. The second part of the method independently analyses subsets of sites to determine a lower bound on the length of any tree. We simultaneously attempt to reduce the total length of the tree and increase the lower bound. When these are equal it is not possible to make a shorter tree with a given data set and given criterion. An example is given where the only two possible minimal trees are found for twelve different mammalian cytochrome c sequences. The criterion of finding the smallest number of minimum base changes was used. However, there is no general method of guaranteeing that a solution will be found in all cases and in particular better methods of improving the estimate of the lower bound need to be developed.
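The per-tree quantity being minimised above, the smallest number of base changes a given tree requires, can be computed with Fitch's small-parsimony algorithm. Below is a minimal Python sketch on an invented four-taxon tree and alignment; it is a generic illustration of counting minimum base changes on a fixed tree, not the authors' implementation or their lower-bounding procedure.

```python
# Fitch small-parsimony: minimum number of substitutions needed on a FIXED tree.
# Illustrative only; the tree and sequences are invented, not the cytochrome c data.

def fitch_changes(tree, leaf_states):
    """tree: nested 2-tuples of leaf names, e.g. (('A', 'B'), ('C', 'D')).
    leaf_states: dict mapping leaf name -> character state at one site.
    Returns the minimum number of state changes at that site."""
    changes = 0

    def post_order(node):
        nonlocal changes
        if isinstance(node, str):                 # leaf
            return {leaf_states[node]}
        left, right = (post_order(child) for child in node)
        common = left & right
        if common:                                # non-empty intersection: no change forced
            return common
        changes += 1                              # union rule costs one substitution
        return left | right

    post_order(tree)
    return changes

def tree_length(tree, alignment):
    """Sum Fitch changes over all sites of an aligned set of sequences."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_changes(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_sites)
    )

if __name__ == "__main__":
    alignment = {"human": "GDVEK", "horse": "GDVEK", "yeast": "GSAKK", "tuna": "GDVAK"}
    tree = (("human", "horse"), ("yeast", "tuna"))
    print("minimum base changes on this tree:", tree_length(tree, alignment))
```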

2.
MOTIVATION: Proteins play a crucial role in biological activity, so much can be learned from measuring protein expression and post-translational modification quantitatively. The reverse-phase protein lysate arrays allow us to quantify the relative expression levels of a protein in many different cellular samples simultaneously. Existing approaches to quantify protein arrays use parametric response curves fit to dilution series data. The results can be biased when the parametric function does not fit the data. RESULTS: We propose a non-parametric approach which adapts to any monotone response curve. The non-parametric approach is shown to be promising via both simulation and real data studies; it reduces the bias due to model misspecification and protects against outliers in the data. The non-parametric approach enables more reliable quantification of protein lysate arrays. AVAILABILITY: Code to implement the proposed method in the statistical package R is available at: http://odin.mdacc.tmc.edu/jhu/lysatearray-analysis/
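The linked R package is the authors' implementation; as a rough illustration of the general idea only, the sketch below fits a monotone response curve to an invented dilution series with off-the-shelf isotonic regression. The dilution steps, noise level, and half-signal read-out are assumptions, not part of the published method.

```python
# Monotone (isotonic) fit to a dilution series: a generic stand-in for a
# non-parametric response curve. Data and dilution steps are invented.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)

# log2 dilution steps (higher = more concentrated) and noisy signal readings
log_dilution = np.repeat(np.arange(8), 3).astype(float)
true_curve = 1.0 / (1.0 + np.exp(-(log_dilution - 4.0)))      # any monotone shape
signal = true_curve + rng.normal(scale=0.05, size=log_dilution.size)

# Isotonic regression adapts to any monotone response without a parametric form
iso = IsotonicRegression(increasing=True, out_of_bounds="clip")
fitted = iso.fit_transform(log_dilution, signal)

# A fitted monotone curve can then be used to place samples on a common scale,
# e.g. read off the dilution step at which the fit crosses half its range.
half = (fitted.min() + fitted.max()) / 2.0
crossing = log_dilution[np.argmax(fitted >= half)]
print("estimated half-signal dilution step:", crossing)
```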

3.
The aim of dose finding studies is sometimes to estimate parameters in a fitted model. The precision of the parameter estimates should be as high as possible. This can be obtained by increasing the number of subjects in the study, N, choosing a good and efficient estimation approach, and by designing the dose finding study in an optimal way. Increasing the number of subjects is not always feasible because of increasing cost, time limitations, etc. In this paper, we assume fixed N and consider estimation approaches and study designs for multiresponse dose finding studies. We work with diabetes dose–response data and compare a system estimation approach that fits a multiresponse Emax model to the data to equation-by-equation estimation that fits uniresponse Emax models to the data. We then derive some optimal designs for estimating the parameters in the multi- and uniresponse Emax models and study the efficiency of these designs.
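For concreteness, here is a minimal equation-by-equation fit of a single uniresponse Emax model, E(d) = E0 + Emax*d/(ED50 + d), using nonlinear least squares. The dose levels, responses, and starting values are invented; the paper's diabetes data and its system (multiresponse) estimator are not reproduced here.

```python
# Equation-by-equation fit of a uniresponse Emax model, E(d) = E0 + Emax*d/(ED50+d).
# Dose levels and responses are invented; this is not the diabetes data set.
import numpy as np
from scipy.optimize import curve_fit

def emax(dose, e0, emax_, ed50):
    return e0 + emax_ * dose / (ed50 + dose)

dose = np.array([0, 5, 10, 20, 40, 80, 160], dtype=float)
resp = np.array([0.2, 1.1, 1.9, 2.8, 3.6, 4.1, 4.3]) + np.random.default_rng(1).normal(0, 0.1, 7)

# Starting values matter for Emax fits; ED50 is started at a mid-range dose.
p0 = [resp[0], resp.max() - resp[0], np.median(dose[dose > 0])]
params, cov = curve_fit(emax, dose, resp, p0=p0)
se = np.sqrt(np.diag(cov))
for name, est, s in zip(["E0", "Emax", "ED50"], params, se):
    print(f"{name}: {est:.2f} (SE {s:.2f})")
```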

4.
Journal of Physiology, 2009, 103(6): 348-352
The inference of interaction structures in multidimensional time series is a major challenge not only in neuroscience but in many fields of research. To gather information about the connectivity in a network from measured data, several parametric as well as non-parametric approaches have been proposed and widely examined. Today a lot of interest is focused on the evolution of the network connectivity in time which might contain information about ongoing tasks in the brain or possible dynamic dysfunctions. Therefore an extension of the current approaches towards time-resolved analysis techniques is desired. We present a parametric approach for time variant analysis, test its performance for simulated data, and apply it to real-world data.

5.
This paper presents a synergistic parametric and non-parametric modeling study of short-term plasticity (STP) in the Schaffer collateral to hippocampal CA1 pyramidal neuron (SC) synapse. Parametric models in the form of sets of differential and algebraic equations have been proposed on the basis of the current understanding of biological mechanisms active within the system. Non-parametric Poisson–Volterra models are obtained herein from broadband experimental input–output data. The non-parametric model is shown to provide better prediction of the experimental output than a parametric model with a single facilitation/depression (FD) process. The parametric model is then validated in terms of its input–output transformational properties using the non-parametric model, since the latter constitutes a canonical and more complete representation of the synaptic nonlinear dynamics. Furthermore, discrepancies between the experimentally-derived non-parametric model and the equivalent non-parametric model of the parametric model suggest the presence of multiple FD processes in the SC synapses. Inclusion of an additional FD process in the parametric model makes it better replicate the characteristics of the experimentally-derived non-parametric model. This improved parametric model in turn provides the requisite biological interpretability that the non-parametric model lacks.

6.
In the analysis of longitudinal data, before assuming a parametric model, an idea of the shape of the variance and correlation functions for both the genetic and environmental parts should be known. When a small number of observations is available for each subject at a fixed set of times, it is possible to estimate unstructured covariance matrices, but not when the number of observations over time is large and when individuals are not measured at all times. The non-parametric approach, based on the variogram, presented by Diggle & Verbyla (1998), is specially adapted for exploratory analysis of such data. This paper presents a generalization of their approach to genetic analyses. The methodology is applied to daily records for milk production in dairy cattle and data on age-specific fertility in Drosophila.
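The exploratory step rests on the sample variogram of longitudinal residuals: half the squared difference between two residuals on the same subject, plotted against their time lag. The sketch below computes that quantity for simulated, unbalanced data; it makes no attempt at the genetic/environmental decomposition that is the point of the paper.

```python
# Sample variogram of longitudinal residuals: for each pair of observations on
# the same subject, take 0.5*(r_ij - r_ik)^2 against the time lag |t_ij - t_ik|.
# Data are simulated; no genetic/environmental decomposition is attempted.
import numpy as np

rng = np.random.default_rng(2)
n_subjects, max_obs = 50, 10

lags, half_sq_diffs = [], []
for _ in range(n_subjects):
    n_obs = rng.integers(4, max_obs + 1)            # unbalanced measurement schedule
    times = np.sort(rng.uniform(0, 300, n_obs))
    # residuals with a slowly varying subject-level component plus measurement error
    u = rng.normal(size=n_obs)
    resid = np.array([u[: i + 1].mean() for i in range(n_obs)]) + rng.normal(0, 0.5, n_obs)
    for j in range(n_obs):
        for k in range(j + 1, n_obs):
            lags.append(times[k] - times[j])
            half_sq_diffs.append(0.5 * (resid[j] - resid[k]) ** 2)

lags, half_sq_diffs = np.array(lags), np.array(half_sq_diffs)

# Average within time-lag bins to reveal the variogram shape.
bins = np.linspace(0, 300, 11)
which = np.digitize(lags, bins)
for b in range(1, len(bins)):
    sel = which == b
    if sel.any():
        print(f"lag {bins[b-1]:5.0f}-{bins[b]:5.0f}: variogram estimate {half_sq_diffs[sel].mean():.3f}")
```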

7.
As an approach to combining the phase II dose finding trial and phase III pivotal trials, we propose a two-stage adaptive design that selects the best among several treatments in the first stage and tests significance of the selected treatment in the second stage. The approach controls the type I error defined as the probability of selecting a treatment and claiming its significance when the selected treatment is no different from placebo, as considered in Bischoff and Miller (2005). Our approach uses the conditional error function and allows determining the conditional type I error function for the second stage based on information observed at the first stage in a similar way to that for an ordinary adaptive design without treatment selection. We examine properties such as expected sample size and stage-2 power of this design with a given type I error and a maximum stage-2 sample size under different hypothesis configurations. We also propose a method to find the optimal conditional error function of a simple parametric form to improve the performance of the design and have derived optimal designs under some hypothesis configurations. Application of this approach is illustrated by a hypothetical example.

8.
We introduce a new optimal design for dose finding with a continuous efficacy endpoint. This design is studied in the context of a flexible model for the mean of the dose-response. The design incorporates aspects of both D- and c-optimality and can be used when the study goals under consideration include dose-response estimation, followed by identification of the target dose. Different optimality criteria are considered. Simulation results are shown comparing our adaptive design to a fixed allocation (without adaptations). We show that both the estimation of dose-response and identification of the minimum effective dose are improved using our design.
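As a rough illustration of the D-optimality ingredient, the sketch below evaluates the log determinant of the Fisher information for a few candidate dose allocations under an assumed Emax mean model with guessed parameter values. The doses, weights, and parameter guesses are invented, and the paper's combined D/c-criterion and adaptive allocation are not implemented.

```python
# Compare candidate designs by D-optimality under an Emax model: maximise the
# log determinant of the Fisher information built from the gradient of
# E(d) = E0 + Emax*d/(ED50+d) at assumed parameter values. Purely illustrative.
import numpy as np

def emax_gradient(dose, emax_, ed50):
    # Gradient of the mean with respect to (E0, Emax, ED50)
    d_e0 = np.ones_like(dose)
    d_emax = dose / (ed50 + dose)
    d_ed50 = -emax_ * dose / (ed50 + dose) ** 2
    return np.stack([d_e0, d_emax, d_ed50], axis=1)

def log_det_information(doses, weights, emax_=4.0, ed50=20.0):
    g = emax_gradient(np.asarray(doses, float), emax_, ed50)
    info = (g * np.asarray(weights)[:, None]).T @ g      # sum_i w_i * g_i g_i^T
    return np.linalg.slogdet(info)[1]

designs = {
    "equal spread": ([0, 40, 80, 160], [0.25, 0.25, 0.25, 0.25]),
    "low doses": ([0, 10, 20, 40], [0.25, 0.25, 0.25, 0.25]),
    "three points": ([0, 20, 160], [1 / 3, 1 / 3, 1 / 3]),
}
for name, (doses, weights) in designs.items():
    print(f"{name:>12}: log det I = {log_det_information(doses, weights):.3f}")
```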

9.
The classic algorithms of Needleman-Wunsch and Smith-Waterman find a maximum a posteriori probability alignment for a pair hidden Markov model (PHMM). To process large genomes that have undergone complex genome rearrangements, almost all existing whole genome alignment methods apply fast heuristics to divide genomes into small pieces that are suitable for Needleman-Wunsch alignment. In these alignment methods, it is standard practice to fix the parameters and to produce a single alignment for subsequent analysis by biologists. As the number of alignment programs applied on a whole genome scale continues to increase, so does the disagreement in their results. The alignments produced by different programs vary greatly, especially in non-coding regions of eukaryotic genomes where the biologically correct alignment is hard to find. Parametric alignment is one possible remedy. This methodology resolves the issue of robustness to changes in parameters by finding all optimal alignments for all possible parameters in a PHMM. Our main result is the construction of a whole genome parametric alignment of Drosophila melanogaster and Drosophila pseudoobscura. This alignment draws on existing heuristics for dividing whole genomes into small pieces for alignment, and it relies on advances we have made in computing convex polytopes that allow us to parametrically align non-coding regions using biologically realistic models. We demonstrate the utility of our parametric alignment for biological inference by showing that cis-regulatory elements are more conserved between Drosophila melanogaster and Drosophila pseudoobscura than previously thought. We also show how whole genome parametric alignment can be used to quantitatively assess the dependence of branch length estimates on alignment parameters.
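For reference, a compact Needleman-Wunsch global alignment with one hard-coded choice of match, mismatch, and gap scores; the parametric-alignment work above is precisely about how the optimal alignment changes as such scores vary, which this sketch does not attempt.

```python
# Minimal Needleman-Wunsch global alignment with one fixed parameter choice.
# The parametric-alignment idea asks how the optimal alignment changes as these
# scores vary; here they are simply hard-coded for illustration.
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Traceback to recover one optimal alignment
    out_a, out_b, i, j = [], [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]

print(*needleman_wunsch("GATTACA", "GCATGCU"), sep="\n")
```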

10.
MOTIVATION: Finding common patterns, or motifs, in the promoter regions of co-expressed genes is an important problem in bioinformatics. A common representation of the motif is by probability matrix or PSSM (position specific scoring matrix). However, even for a motif of length six or seven, there is no algorithm that can guarantee finding the exact optimal matrix from an infinite number of possible matrices. RESULTS: This paper introduces the first algorithm, called EOMM, for finding the exact optimal matrix-represented motif, or simply optimal motif. Based on branch-and-bound searching by partitioning the solution space recursively, EOMM can find the optimal motif of size up to eight or nine, and a motif of larger size with any desired accuracy on the principle that the smaller the error bound, the longer the running time. Experiments show that for some real and simulated data sets, EOMM finds the motif despite very weak signals when existing software, such as MEME and MITRA-PSSM, fails to do so.
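The sketch below only illustrates what a PSSM does once it exists, scoring sequence windows by log-odds against a uniform background; the motif instances and pseudocount are invented, and the EOMM branch-and-bound search for the exact optimal matrix is not implemented.

```python
# Scan promoter-like sequences with a PSSM (log-odds scores against a uniform
# background). The motif instances are invented; this only illustrates scoring,
# not the branch-and-bound search for the exact optimal matrix.
import math

instances = ["TGACTCA", "TGAGTCA", "TGACTCT", "TGAGTAA"]   # toy aligned motif sites
width, alphabet, pseudo = len(instances[0]), "ACGT", 0.5

# Position-specific log-odds: log( (count+pseudo)/(N+4*pseudo) / 0.25 )
pssm = []
for pos in range(width):
    column = [seq[pos] for seq in instances]
    pssm.append({
        base: math.log(((column.count(base) + pseudo) / (len(instances) + 4 * pseudo)) / 0.25)
        for base in alphabet
    })

def best_hit(sequence):
    """Best-scoring window and its offset in the sequence."""
    scores = [
        (sum(pssm[k][sequence[i + k]] for k in range(width)), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scores)

for seq in ["CCTGACTCAGG", "AAAAAAAAAAA"]:
    score, offset = best_hit(seq)
    print(f"{seq}: best score {score:.2f} at offset {offset}")
```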

11.
Species dispersal studies provide valuable information in biological research. Restricted dispersal may give rise to a non-random distribution of genotypes in space. Detection of spatial genetic structure may therefore provide valuable insight into dispersal. Spatial structure has been treated via autocorrelation analysis with several univariate statistics for which results can depend on the sampling design. New geostatistical approaches (variogram-based analysis) have been proposed to overcome this problem. However, modelling parametric variograms could be difficult in practice. We introduce a non-parametric variogram-based method for autocorrelation analysis between DNA samples that have been genotyped by means of multilocus-multiallele molecular markers. The method addresses two important aspects of fine-scale spatial genetic analyses: the identification of a non-random distribution of genotypes in space, and the estimation of the magnitude of any non-random structure. The method uses a plot of the squared Euclidean genetic distances vs. spatial distances between pairs of DNA samples as empirical variogram. The underlying spatial trend in the plot is fitted by a non-parametric smoothing (LOESS, Local Regression). Finally, the predicted LOESS values are explained by segmented regressions (SR) to obtain classical spatial values such as the extent of autocorrelation. For illustration we use multivariate and single-locus genetic distances calculated from a microsatellite data set for which autocorrelation was previously reported. The LOESS/SR method produced a good fit, yielding autocorrelation values similar to those previously published for these data. The fit by LOESS/SR was simpler to obtain than the parametric analysis since initial parameter values are not required during the trend estimation process. The LOESS/SR method offers a new alternative for spatial analysis.
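A rough Python analogue of the two steps: a LOWESS smooth of squared genetic distance against spatial distance for all sample pairs, followed by a naive one-breakpoint grid search standing in for segmented regression. The simulated distances, smoothing span, and breakpoint grid are assumptions; this is not the authors' implementation.

```python
# LOWESS trend of squared genetic distance vs. spatial distance between sample
# pairs, followed by a naive one-breakpoint search as a stand-in for segmented
# regression. All distances are simulated; this is not the published pipeline.
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(3)
n_pairs = 2000
spatial = rng.uniform(0, 100, n_pairs)
# genetic distance rises with spatial distance up to ~30 units, then plateaus
trend = np.minimum(spatial, 30.0) / 30.0
genetic_sq = trend + rng.normal(0, 0.3, n_pairs)

smoothed = lowess(genetic_sq, spatial, frac=0.3, return_sorted=True)
xs, ys = smoothed[:, 0], smoothed[:, 1]

# Grid search for the breakpoint where a rising-then-flat fit matches best
def sse_with_break(bp):
    rising = xs <= bp
    fit = np.where(rising, np.polyval(np.polyfit(xs[rising], ys[rising], 1), xs), ys[~rising].mean())
    return np.sum((ys - fit) ** 2)

candidates = np.linspace(10, 90, 81)
best = candidates[np.argmin([sse_with_break(b) for b in candidates])]
print(f"estimated extent of spatial genetic structure: about {best:.1f} distance units")
```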

12.
MOTIVATION: Estimation of misclassification error has received increasing attention in clinical diagnosis and bioinformatics studies, especially in small sample studies with microarray data. Current error estimation methods are not satisfactory because they either have large variability (such as leave-one-out cross-validation) or large bias (such as resubstitution and leave-one-out bootstrap). While small sample size remains one of the key features of costly clinical investigations or of microarray studies that have limited resources in funding, time and tissue materials, accurate and easy-to-implement error estimation methods for small samples are desirable and will be beneficial. RESULTS: A bootstrap cross-validation method is studied. It achieves accurate error estimation through a simple procedure with bootstrap resampling and only costs computer CPU time. Simulation studies and applications to microarray data demonstrate that it performs consistently better than its competitors. This method possesses several attractive properties: (1) it is implemented through a simple procedure; (2) it performs well for small samples, with sample sizes as small as 16; (3) it is not restricted to any particular classification rules and thus applies to many parametric or non-parametric methods.
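One common reading of bootstrap cross-validation is to run ordinary k-fold cross-validation inside each bootstrap resample and average the resulting error estimates. The sketch below does that for a simulated 16-sample, two-class problem with an off-the-shelf classifier; the data, classifier choice, and fold count are assumptions, not the authors' exact procedure.

```python
# Bootstrap cross-validation sketch: draw bootstrap resamples of a small data
# set, run k-fold cross-validation within each resample, and average the error.
# Generic illustration with simulated data, not the authors' exact procedure.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n, p = 16, 10                                   # a deliberately small sample
X = rng.normal(size=(n, p))
y = np.repeat([0, 1], n // 2)
X[y == 1, :2] += 1.0                            # modest class separation

def bootstrap_cv_error(X, y, n_boot=100, k=4):
    errors = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))   # resample cases with replacement
        if np.bincount(y[idx], minlength=2).min() < k:
            continue                            # skip resamples too unbalanced for k folds
        acc = cross_val_score(LinearDiscriminantAnalysis(), X[idx], y[idx], cv=k)
        errors.append(1.0 - acc.mean())
    return float(np.mean(errors))

print(f"bootstrap cross-validation error estimate: {bootstrap_cv_error(X, y):.3f}")
```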

13.
In studies of human balance, it is common to fit stimulus-response data by tuning the time-delay and gain parameters of a simple delayed feedback model. Many interpret this fitted model as evidence that predictive processes are not required to explain existing data on standing balance. However, two questions lead us to doubt this approach. First, does fitting a delayed feedback model lead to reliable estimates of the time-delay? Second, can a non-predictive controller provide an explanation compatible with the independently estimated time delay? For methodological and experimental clarity, we study human balancing of a simulated inverted pendulum via joystick and screen. A two-step approach to data analysis is used: first, a non-parametric model—the closed-loop impulse response—is estimated from the experimental data; second, a parametric model is fitted to the non-parametric impulse-response by adjusting time-delay and controller parameters. To support the second step, a new explicit formula relating controller parameters to closed-loop impulse response is derived. Two classes of controller are investigated within a common state-space context: non-predictive and predictive. It is found that the time-delay estimate arising from the second step is strongly dependent on which controller class is assumed; in particular, the non-predictive control assumption leads to time-delay estimates that are smaller than those arising from the predictive assumption. Moreover, the time-delays estimated using the non-predictive control assumption are not consistent with a lower-bound on the time-delay of the non-parametric model, whereas the corresponding predictive result is consistent. Thus, while the goodness of fit only marginally favoured predictive over non-predictive control, if we add the additional constraint that the model must reproduce the non-parametric time delay, then the non-predictive control model fails. We conclude (1) that the time-delay should be estimated independently of fitting a low order parametric model, (2) that balance of the simulated inverted pendulum could not be explained by the non-predictive control model and (3) that predictive control provided a better explanation than non-predictive control.
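To make the non-predictive controller class concrete, here is a toy simulation of delayed proportional-derivative control of a linearised inverted pendulum, with the closed-loop response to an initial perturbation standing in for an impulse response. The plant constant, gains, delay, and integration scheme are all invented; the paper's predictive controllers and its impulse-response estimation from joystick data are not reproduced.

```python
# Toy simulation: delayed PD (non-predictive) control of a linearised inverted
# pendulum, theta_ddot = (g/L)*theta + u, where the control acts on the state
# observed tau seconds earlier. Gains, delay and plant are invented.
import numpy as np

g_over_L = 9.81                # linearised pendulum constant (1 m pendulum, assumed)
dt, T, tau = 0.001, 5.0, 0.10  # integration step, duration, feedback delay (s)
kp, kd = 25.0, 6.0             # proportional and derivative gains (assumed, not fitted)
delay_steps = int(round(tau / dt))

n = int(T / dt)
theta = np.zeros(n)            # pendulum angle (rad)
omega = np.zeros(n)            # angular velocity (rad/s)
theta[0] = 0.05                # impulse-like initial perturbation

for t in range(1, n):
    i = max(t - 1 - delay_steps, 0)            # state as seen tau seconds ago
    u = -kp * theta[i] - kd * omega[i]         # non-predictive: delayed PD feedback
    alpha = g_over_L * theta[t - 1] + u        # theta_ddot = (g/L)*theta + u
    omega[t] = omega[t - 1] + alpha * dt
    theta[t] = theta[t - 1] + omega[t] * dt    # semi-implicit Euler step

for k in range(0, n, int(1.0 / dt)):
    print(f"t = {k * dt:4.1f} s   theta = {theta[k]:+.4f} rad")
```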

14.
Threshold dose/concentration values, such as the lowest effective dose, minimum effective dose or the lowest effective concentration (LED, MED or LEC, respectively) are in use as an alternative to mutagen potency measures based on 'rate' measurements (e.g., the slope of the initial part of the dose-response curve). In this respect, several statistical procedures for the corresponding so-called 'dose finding' were proposed during the last decades. However, most of them disregard the discrete nature of responses such as the plate colony count in the Ames Salmonella assay. When the plate counts agree with the Poisson assumption, two procedures considered here seem to be appropriate for dose finding. One is based on the stepwise collapsing of the homogeneous control and dose counts; another consists of constructing confidence limits for the mutation induction factor (MIF). When the dose and control counts are non-overlapping, a simple 'visual' non-parametric estimation of the LED is possible. The applicability and validity of the methods are demonstrated with two data sets on the mutagenicity of the beta-carboline alkaloid harmine and of one of the oxidation products of apomorphine.
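The MIF confidence-limit idea can be sketched via the standard conditional-binomial argument for a ratio of Poisson means: given the total count, the treated count is binomial, so Clopper-Pearson limits for that proportion transform into limits for the ratio. The plate counts below are invented, and the published procedure may differ in detail.

```python
# Confidence interval for the mutation induction factor (MIF), the ratio of
# treated to control Poisson plate-count means, via the conditional binomial
# argument. Counts are invented; the published procedure may differ in detail.
from scipy.stats import beta

def mif_confidence_interval(treated_counts, control_counts, alpha=0.05):
    x_t, n_t = sum(treated_counts), len(treated_counts)
    x_c, n_c = sum(control_counts), len(control_counts)
    total = x_t + x_c

    def to_ratio(p):
        # p = n_t*lam_t / (n_t*lam_t + n_c*lam_c)  <=>  lam_t/lam_c = p/(1-p) * n_c/n_t
        return float("inf") if p >= 1.0 else (p / (1.0 - p)) * (n_c / n_t)

    # Clopper-Pearson limits for the conditional binomial proportion
    lo_p = beta.ppf(alpha / 2, x_t, total - x_t + 1) if x_t > 0 else 0.0
    hi_p = beta.ppf(1 - alpha / 2, x_t + 1, total - x_t) if x_t < total else 1.0
    mif = (x_t / n_t) / (x_c / n_c)
    return mif, to_ratio(lo_p), to_ratio(hi_p)

control = [18, 22, 20]   # revertant colonies per plate at zero dose (invented)
treated = [35, 41, 38]   # revertant colonies per plate at one test dose (invented)
mif, lo, hi = mif_confidence_interval(treated, control)
print(f"MIF = {mif:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
# The lowest dose whose interval excludes 1 (or another chosen threshold) could
# then serve as an LED-type estimate.
```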

15.
Two methods for single-trial analysis were compared: an established parametric template approach and a recently proposed non-parametric method based on complex bandpass filtering. The comparison was carried out by means of pseudo-real simulations based on magnetoencephalography measurements of cortical responses to auditory signals. The comparison focused on amplitude and latency estimation of the M100 response. The results show that both methods are well suited for single-trial analysis of the auditory evoked M100. While both methods performed similarly with respect to latency estimation, the non-parametric approach was observed to be more robust for amplitude estimation. The non-parametric approach can thus be recommended as an additional valuable tool for single-trial analysis.
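As a generic stand-in for the complex-bandpass idea, the sketch below bandpass-filters a simulated single trial and reads single-trial amplitude and latency off the Hilbert envelope. The simulated "M100-like" wave packet, filter band, and search window are assumptions, not the published method or real MEG data.

```python
# Single-trial amplitude/latency from a bandpass + Hilbert envelope; a generic
# stand-in for the non-parametric complex-bandpass approach. Everything here,
# including the simulated response, is invented for illustration.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 600.0                                     # sampling rate in Hz (assumed)
t = np.arange(0.0, 0.5, 1.0 / fs)              # one 500 ms trial
rng = np.random.default_rng(5)

# Simulated trial: a damped ~10 Hz wave packet centred near 100 ms plus noise
packet = np.exp(-((t - 0.1) ** 2) / (2 * 0.02**2)) * np.sin(2 * np.pi * 10 * (t - 0.1))
trial = packet + rng.normal(0.0, 0.4, t.size)

# Bandpass around the response band, then take the analytic-signal envelope
b, a = butter(4, [5 / (fs / 2), 15 / (fs / 2)], btype="band")
envelope = np.abs(hilbert(filtfilt(b, a, trial)))

window = (t > 0.05) & (t < 0.20)               # search window for the M100-like peak
peak_idx = np.argmax(envelope[window])
latency = t[window][peak_idx]
amplitude = envelope[window][peak_idx]
print(f"single-trial estimate: latency {latency * 1000:.0f} ms, amplitude {amplitude:.2f}")
```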

16.
O'Quigley J. Biometrics, 2005, 61(3): 749-756
The continual reassessment method (CRM) is a dose-finding design using a dynamic sequential updating scheme. In common with other dynamic schemes the method estimates a current dose level corresponding to some target percentile for experimentation. The estimate is based on all included subjects. This continual reevaluation is made possible by the use of a simple model. As it stands, neither the CRM, nor any of the other dynamic schemes, allows for the correct estimation of some target percentile based on retrospective data, apart from the exceptional situation in which the simplified model exactly generates the observations. In this article we focus on the very specific issue of retrospective analysis of data generated by some arbitrary mechanism and subsequently analyzed via the continual reassessment method. We show how this can be done consistently. The proposed methodology is not restricted to that particular design and is applicable to any sequential updating scheme in which dose levels are associated with percentiles via model inversion.
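A minimal prospective CRM update is sketched below: a one-parameter power ("skeleton") model, a normal prior on the parameter, a posterior computed on a grid, and the recommended dose taken as the one whose posterior mean toxicity is closest to the target. The skeleton, prior spread, target, and trial history are invented, and the retrospective-estimation extension proposed in the article is not implemented.

```python
# Minimal continual reassessment method (CRM) update: one-parameter power model
# p_i(a) = skeleton_i ** exp(a), normal prior on a, posterior by grid summation.
# Skeleton, prior and trial data are invented; this is the standard prospective
# CRM, not the retrospective-estimation extension discussed in the abstract.
import numpy as np

skeleton = np.array([0.05, 0.10, 0.20, 0.35, 0.50])   # prior toxicity guesses (invented)
target = 0.25                                          # target toxicity probability

def next_dose(dose_levels, toxicities, prior_sd=1.34):
    a = np.linspace(-4.0, 4.0, 2001)                   # grid over the model parameter
    prior = np.exp(-0.5 * (a / prior_sd) ** 2)
    # p[i, j] = P(toxicity at the i-th treated patient's dose | a_j)
    p = skeleton[np.asarray(dose_levels)][:, None] ** np.exp(a)
    y = np.asarray(toxicities)[:, None]
    likelihood = np.prod(p**y * (1.0 - p) ** (1 - y), axis=0)
    posterior = prior * likelihood
    posterior /= posterior.sum()
    # Posterior mean toxicity at each dose level; recommend the one closest to target
    post_tox = [float(np.sum(skeleton[i] ** np.exp(a) * posterior)) for i in range(len(skeleton))]
    return int(np.argmin(np.abs(np.array(post_tox) - target))), post_tox

dose_levels = [0, 0, 1, 1, 2, 2]    # dose-level index given to each patient so far
toxicities = [0, 0, 0, 1, 0, 1]     # 1 = dose-limiting toxicity observed
dose, post_tox = next_dose(dose_levels, toxicities)
print("posterior mean toxicity by dose:", [round(t, 3) for t in post_tox])
print("recommended next dose level (0-based):", dose)
```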

17.
Many biological data sets, from field observations and manipulative experiments, involve crossed factor designs, analysed in a univariate context by higher-way analyses of variance which partition out ‘main’ and ‘interaction’ effects. Indeed, tests for significance of interactions among factors, such as differing Before-After responses at Control and Impact sites, are the basis of the widely used BACI strategy for detecting impacts in the environment. There are difficulties, however, in generalising simple univariate definitions of interaction, from classic linear models, to the robust, non-parametric multivariate methods that are commonly required in handling assemblage data. The size of an interaction term, and even its existence at all, depends crucially on the measurement scale, so it is fundamentally a parametric construct. Despite this, certain forms of interaction can be examined using non-parametric methods, namely those evidenced by changing assemblage patterns over many time periods, for replicate sites from different experimental conditions (types of ‘Beyond BACI’ design) - or changing multivariate structure over space, at many observed times. Second-stage MDS, which can be thought of as an MDS plot of the pairwise similarities between MDS plots (e.g. of assemblage time trajectories), can be used to illustrate such interactions, and they can be formally tested by second-stage ANOSIM permutation tests. Similarities between (first-stage) multivariate patterns are assessed by rank-based matrix correlations, preserving the fully non-parametric approach common in marine community studies. The method is exemplified using time-series data on corals from Thailand, macrobenthos from Tees Bay, UK, and macroalgae from a complex recolonisation experiment carried out in the Ligurian Sea, Italy. The latter data set is also used to demonstrate how the analysis copes straightforwardly with certain repeated-measures designs.
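The second-stage construction can be sketched as follows: compute, for each site, the Bray-Curtis dissimilarities among its time points (the first-stage pattern), then rank-correlate those patterns between sites; one minus the correlation gives a second-stage distance matrix that could be ordinated by MDS or tested by permutation. The abundance data below are invented and no ANOSIM test is carried out.

```python
# Second-stage sketch: rank-correlate the first-stage (within-site, among-time)
# Bray-Curtis patterns between sites. Abundances are invented for illustration.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(6)
n_sites, n_times, n_species = 4, 6, 12
# abundance[site, time, species]; sites 0-1 drift over time, sites 2-3 stay stable
abundance = rng.poisson(5, size=(n_sites, n_times, n_species)).astype(float)
drift = np.linspace(0, 4, n_times)[:, None] * (np.arange(n_species) < 4)
abundance[0] += drift
abundance[1] += drift

# First stage: condensed Bray-Curtis dissimilarities among times, one vector per site
first_stage = [pdist(abundance[s], metric="braycurtis") for s in range(n_sites)]

# Second stage: rank correlation between first-stage patterns, turned into a distance
second_stage = np.zeros((n_sites, n_sites))
for i in range(n_sites):
    for j in range(n_sites):
        rho = 1.0 if i == j else spearmanr(first_stage[i], first_stage[j])[0]
        second_stage[i, j] = 1.0 - rho

print(np.round(second_stage, 2))
```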

18.
The coancestry coefficient, also known as the population structure parameter, is of great interest in population genetics. It can be thought of as the intraclass correlation of pairs of alleles within populations and it can serve as a measure of genetic distance between populations. For a general class of evolutionary models it determines the distribution of allele frequencies among populations. Under more restrictive models it can be regarded as the probability of identity by descent of any pair of alleles at a locus within a random mating population. In this paper we review estimation procedures that use the method of moments or are maximum likelihood under the assumption of normally distributed allele frequencies. We then consider the problem of testing hypotheses about this parameter. In addition to parametric and non-parametric bootstrap tests we present an asymptotically-distributed chi-square test. This test reduces to the contingency-table test for equal sample sizes across populations. Our new test appears to be more powerful than previous tests, especially for loci with multiple alleles. We apply our methods to HapMap SNP data to confirm that the coancestry coefficient for humans is strictly positive.
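As a simplified, heterozygosity-based stand-in for the coancestry estimators reviewed here, the sketch below computes a Nei-style G_ST = 1 - H_S/H_T from allele frequencies at one multi-allelic locus. This is a different (and cruder) estimator than the Weir-Cockerham coancestry coefficient, and the frequencies are invented rather than taken from HapMap.

```python
# Simplified illustration of a population-structure estimate from allele
# frequencies: Nei-style G_ST = 1 - H_S/H_T at one multi-allelic locus. A crude
# relative of the coancestry (theta) estimators; frequencies are invented.
import numpy as np

# rows: populations, columns: allele frequencies at one locus (each row sums to 1)
freqs = np.array([
    [0.70, 0.20, 0.10],
    [0.40, 0.40, 0.20],
    [0.55, 0.15, 0.30],
])

def gst(freqs):
    h_s = np.mean(1.0 - np.sum(freqs**2, axis=1))   # mean within-population heterozygosity
    p_bar = freqs.mean(axis=0)                       # pooled allele frequencies
    h_t = 1.0 - np.sum(p_bar**2)                     # total heterozygosity
    return 1.0 - h_s / h_t

print(f"G_ST estimate at this locus: {gst(freqs):.3f}")
```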

19.
A good understanding and characterization of the dose response relationship of any new compound is an important and ubiquitous problem in many areas of scientific investigation. This is especially true in the context of pharmaceutical drug development, where it is mandatory to launch safe drugs which demonstrate a clinically relevant effect. Selecting a dose too high may result in unacceptable safety problems, while selecting a dose too low may lead to ineffective drugs. Dose finding studies thus play a key role in any drug development program and are often the gate-keeper for large confirmatory studies. In this overview paper we focus on definitive and confirmatory dose finding studies in Phase II or III, reviewing relevant statistical design and analysis methods. In particular, we describe multiple comparison procedures, modeling approaches, and hybrid methods combining the advantages of both. An outlook to adaptive dose finding methods is also given. We use a real data example to illustrate the methods, together with a brief overview of relevant software.

20.
In clinical trials examining the incidence of pneumonia it is a common practice to measure infection via both invasive and non-invasive procedures. In the context of a recently completed randomized trial comparing two treatments, the invasive procedure was only utilized in certain scenarios due to the added risk involved, and given that the level of the non-invasive procedure surpassed a given threshold. Hence, what was observed was bivariate data with a pattern of missingness in the invasive variable dependent upon the value of the observed non-invasive observation within a given pair. In order to compare two treatments with bivariate observed data exhibiting this pattern of missingness, we developed a semi-parametric methodology utilizing the density-based empirical likelihood approach in order to provide a non-parametric approximation to Neyman-Pearson-type test statistics. This novel empirical likelihood approach has both parametric and non-parametric components. The non-parametric component utilizes the observations for the non-missing cases, while the parametric component is utilized to tackle the case where observations are missing with respect to the invasive variable. The method is illustrated through its application to the actual data obtained in the pneumonia study and is shown to be an efficient and practical method.
