Similar Documents
20 similar documents found (search time: 31 ms)
1.
Transitive inference has been historically touted as a hallmark of human cognition. However, the ability of non‐human animals to perform this type of inference is being increasingly investigated. Experimentally, three main methods are commonly used to evaluate transitivity in animals: those that investigate social dominance relationships, the n‐term task series and the less well known associative transitivity task. Here, we revisit the question of what exactly constitutes transitive inference based upon a formal and habitual definition and propose two essential criteria for experimentally testing it in animals. We then apply these criteria to evaluate the existing body of work on this fundamental aspect of cognition using exemplars. Our evaluation reveals that some methods rely heavily on salient assumptions that are both questionable and almost impossible to verify in order to make a claim of transitive inference in animals. For example, we found shortcomings with most n‐term task designs in that they often do not provide an explicit transitive relationship and/or an ordered set on which transitive inference can be performed. Consequently, they rely on supplementary assumptions to make a claim of transitive inference. However, as these assumptions are either impossible or extremely difficult to validate in non‐human animals, the results obtained using these specific n‐term tasks cannot be taken as unambiguous demonstrations (or the lack thereof) of transitive inference. This realisation is one that is generally overlooked in the literature. In contrast, the associative transitivity task and the dominance relationship test both meet the criteria for transitive inference. However, although the dominance relationship test can disambiguate between transitive inference accounts and associative ones, the associative transitivity test cannot. Our evaluation also highlights the limitations and future challenges of current associative models of transitive inference. We propose three new experimental methods that can be applied within any theoretical framework to ensure that the experimental behaviour observed is indeed the result of transitive inference whilst removing the need for supplementary assumptions: the test for the opposite transitive relation, the discrimination test between two separate and previously non‐reinforced items, and the control for absolute knowledge.

2.
Cells and bacteria growing in culture are subject to mutation, and as this mutation is the ultimate substrate for selection and evolution, the factors controlling the mutation rate are of some interest. The mutational event is not observed directly, but is inferred from the phenotype of the original mutant or of its descendants; the rate of mutation is inferred from the number of such mutant phenotypes. Such inference presumes a knowledge of the probability distribution for the size of a clone arising from a single mutation. We develop a mathematical formulation that assists in the design and analysis of experiments which investigate mutation rates and mutant clone size distribution, and we use it to analyse data for which the classical Luria-Delbrück clone-size distribution must be rejected.
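
As a rough illustration of why the clone-size distribution matters for inferring mutation rates, the following sketch simulates the classical Luria-Delbrück setting (deterministic doubling, mutations arising during growth, each mutant clone then expanding). It is not the authors' generalized formulation; the parameter values and the NumPy-based implementation are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_culture(n0=100, generations=15, mu=1e-6):
    """Simulate one culture under the classical Luria-Delbruck scheme:
    deterministic doubling, mutations arising at rate mu per cell per
    generation, and each mutant clone doubling thereafter."""
    mutants = 0
    n = n0
    for _ in range(generations):
        new_mutations = rng.poisson(mu * n)      # mutations during this generation
        mutants = 2 * mutants + new_mutations    # existing mutants double, new clones start at size 1
        n *= 2                                   # total population doubles
    return mutants

# Distribution of mutant counts across parallel cultures: note the heavy right tail
counts = np.array([simulate_culture() for _ in range(2000)])
print("mean =", counts.mean(), "median =", np.median(counts), "max =", counts.max())
```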

3.
This paper discusses multivariate interval-censored failure time data that occur when there exist several correlated survival times of interest and only interval-censored data are available for each survival time. Such data occur in many fields. One is tumorigenicity experiments, which usually concern different types of tumors, tumors occurring in different locations of animals, or both. For regression analysis of such data, we develop a marginal inference approach using the additive hazards model and apply it to a set of bivariate interval-censored data arising from a tumorigenicity experiment. Simulation studies are conducted for the evaluation of the presented approach and suggest that the approach performs well for practical situations.

4.
To estimate an overall treatment difference with data from a randomized comparative clinical study, baseline covariates are often utilized to increase the estimation precision. Using the standard analysis of covariance technique for making inferences about such an average treatment difference may not be appropriate, especially when the fitted model is nonlinear. On the other hand, the novel augmentation procedure recently studied, for example, by Zhang and others (2008. Improving efficiency of inferences in randomized clinical trials using auxiliary covariates. Biometrics 64, 707-715) is quite flexible. However, in general, it is not clear how to select covariates for augmentation effectively. An overly adjusted estimator may inflate the variance and in some cases be biased. Furthermore, the results from the standard inference procedure by ignoring the sampling variation from the variable selection process may not be valid. In this paper, we first propose an estimation procedure, which augments the simple treatment contrast estimator directly with covariates. The new proposal is asymptotically equivalent to the aforementioned augmentation method. To select covariates, we utilize the standard lasso procedure. Furthermore, to make valid inference from the resulting lasso-type estimator, a cross validation method is used. The validity of the new proposal is justified theoretically and empirically. We illustrate the procedure extensively with a well-known primary biliary cirrhosis clinical trial data set.
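
The sketch below shows, under stated assumptions, one way to augment the simple treatment contrast with lasso-fitted outcome models in a randomized trial. It uses synthetic data and scikit-learn's LassoCV, and it omits the paper's cross-validated inference for the resulting estimator, so read it as a toy version of the idea rather than the authors' procedure.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)

# --- synthetic randomized-trial data (illustrative only) ---
n, p, pi = 400, 20, 0.5
X = rng.normal(size=(n, p))
A = rng.binomial(1, pi, size=n)                     # randomized treatment indicator
Y = 1.0 * A + X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n)

# arm-specific outcome models with lasso-selected covariates
m1 = LassoCV(cv=5).fit(X[A == 1], Y[A == 1]).predict(X)
m0 = LassoCV(cv=5).fit(X[A == 0], Y[A == 0]).predict(X)

# augmented treatment-contrast estimator (randomization probability pi is known)
theta_simple = Y[A == 1].mean() - Y[A == 0].mean()
theta_aug = np.mean(m1 - m0
                    + A * (Y - m1) / pi
                    - (1 - A) * (Y - m0) / (1 - pi))
print(f"unadjusted: {theta_simple:.3f}  augmented: {theta_aug:.3f}")
```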

5.
Doubly robust estimation in missing data and causal inference models
Bang H, Robins JM. Biometrics 2005, 61(4): 962-973
The goal of this article is to construct doubly robust (DR) estimators in ignorable missing data and causal inference models. In a missing data model, an estimator is DR if it remains consistent when either (but not necessarily both) a model for the missingness mechanism or a model for the distribution of the complete data is correctly specified. Because with observational data one can never be sure that either a missingness model or a complete data model is correct, perhaps the best that can be hoped for is to find a DR estimator. DR estimators, in contrast to standard likelihood-based or (nonaugmented) inverse probability-weighted estimators, give the analyst two chances, instead of only one, to make a valid inference. In a causal inference model, an estimator is DR if it remains consistent when either a model for the treatment assignment mechanism or a model for the distribution of the counterfactual data is correctly specified. Because with observational data one can never be sure that a model for the treatment assignment mechanism or a model for the counterfactual data is correct, inference based on DR estimators should improve upon previous approaches. Indeed, we present the results of simulation studies which demonstrate that the finite sample performance of DR estimators is as impressive as theory would predict. The proposed method is applied to a cardiovascular clinical trial.
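
A minimal sketch of a doubly robust (augmented inverse-probability-weighted) estimator of a mean with outcomes missing at random is given below. The synthetic data, the logistic missingness model, and the linear outcome model are all illustrative assumptions, not the article's exact construction; the estimator stays consistent if either working model is correct.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(2)

# --- synthetic data: outcome Y missing at random given covariate X ---
n = 2000
X = rng.normal(size=(n, 1))
Y = 2.0 + 1.5 * X[:, 0] + rng.normal(size=n)
p_obs = 1 / (1 + np.exp(-(0.5 + X[:, 0])))          # missingness depends on X only
R = rng.binomial(1, p_obs)                           # R = 1: Y observed

# working models: missingness mechanism and complete-data regression
pi_hat = LogisticRegression().fit(X, R).predict_proba(X)[:, 1]
m_hat = LinearRegression().fit(X[R == 1], Y[R == 1]).predict(X)

# doubly robust estimator of E[Y]: IPW term plus augmentation term
mu_dr = np.mean(R * Y / pi_hat - (R - pi_hat) / pi_hat * m_hat)
print("true mean ~ 2.0, DR estimate:", round(mu_dr, 3))
```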

6.
We examine situations where interest lies in the conditional association between outcome and exposure variables, given potential confounding variables. Concern arises that some potential confounders may not be measured accurately, whereas others may not be measured at all. Some form of sensitivity analysis might be employed, to assess how this limitation in available data impacts inference. A Bayesian approach to sensitivity analysis is straightforward in concept: a prior distribution is formed to encapsulate plausible relationships between unobserved and observed variables, and posterior inference about the conditional exposure–disease relationship then follows. In practice, though, it can be challenging to form such a prior distribution in both a realistic and simple manner. Moreover, it can be difficult to develop an attendant Markov chain Monte Carlo (MCMC) algorithm that will work effectively on a posterior distribution arising from a highly nonidentified model. In this article, a simple prior distribution for acknowledging both poorly measured and unmeasured confounding variables is developed. It requires that only a small number of hyperparameters be set by the user. Moreover, a particular computational approach for posterior inference is developed, because application of MCMC in a standard manner is seen to be ineffective in this problem.
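
For intuition, here is a deliberately simplified Monte Carlo sensitivity analysis in the same spirit: bias parameters for a single unmeasured binary confounder are drawn from user-specified priors and the observed risk ratio is adjusted with the standard external-adjustment formula. This is not the authors' prior construction or their tailored posterior computation, and every numerical value is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

rr_obs = 1.8                  # observed exposure-outcome risk ratio (illustrative)
n_draws = 50_000

# prior beliefs about an unmeasured binary confounder U (all hypothetical):
p_u_exposed   = rng.beta(4, 6, n_draws)                      # prevalence of U among exposed
p_u_unexposed = rng.beta(2, 8, n_draws)                      # prevalence of U among unexposed
rr_ud         = rng.lognormal(np.log(2.0), 0.3, n_draws)     # U-outcome risk ratio

# external-adjustment (bias) factor for a risk ratio and the implied adjusted RR
bias = (p_u_exposed * (rr_ud - 1) + 1) / (p_u_unexposed * (rr_ud - 1) + 1)
rr_adj = rr_obs / bias

print("confounding-adjusted RR, 2.5% / 50% / 97.5% percentiles:")
print(np.round(np.percentile(rr_adj, [2.5, 50, 97.5]), 2))
```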

7.
Estimation of the number of species at spatial scales too large to census directly is a longstanding ecological challenge. A recent comprehensive census of tropical arthropods and trees in Panama provides a unique opportunity to apply an inference procedure for up-scaling species richness and thereby make progress toward that goal. Confidence in the underlying theory is first established by showing that the method accurately predicts the species abundance distribution for trees and arthropods, and in particular accurately captures the rare tail of the observed distributions. The rare tail is emphasized because the shape of the species-area relationship is especially influenced by the numbers of rare species. The inference procedure is then applied to estimate the total number of arthropod and tree species at spatial scales ranging from a 6000 ha forest reserve to all of Panama, with input data only from censuses in 0.04 ha plots. The analysis suggests that at the scale of the reserve there are roughly twice as many arthropod species as previously estimated. For the entirety of Panama, inferred tree species richness agrees with an accepted empirical estimate, while inferred arthropod species richness is significantly below a previous published estimate that has been criticized as too high. An extension of the procedure to estimate species richness at continental scale is proposed.

8.
Gianola D, van Kaam JB. Genetics 2008, 178(4): 2289-2303
Reproducing kernel Hilbert spaces regression procedures for prediction of total genetic value for quantitative traits, which make use of phenotypic and genomic data simultaneously, are discussed from a theoretical perspective. It is argued that a nonparametric treatment may be needed for capturing the multiple and complex interactions potentially arising in whole-genome models, i.e., those based on thousands of single-nucleotide polymorphism (SNP) markers. After a review of reproducing kernel Hilbert spaces regression, it is shown that the statistical specification admits a standard mixed-effects linear model representation, with smoothing parameters treated as variance components. Models for capturing different forms of interaction, e.g., chromosome-specific, are presented. Implementations can be carried out using software for likelihood-based or Bayesian inference.
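
Because RKHS regression with a fixed kernel coincides with kernel ridge regression (the smoothing parameters playing the role of variance components), a compact sketch can be written with scikit-learn. The simulated SNP matrix, the Gaussian kernel, and the cross-validated tuning used here are illustrative stand-ins for the likelihood-based or Bayesian estimation discussed in the paper.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(4)

# simulated genotypes (0/1/2 minor-allele counts) with an interacting pair of loci
n, p = 300, 500
X = rng.integers(0, 3, size=(n, p)).astype(float)
y = X[:, 0] - X[:, 1] + 0.8 * X[:, 2] * X[:, 3] + rng.normal(size=n)

# RKHS regression = kernel ridge with a Gaussian (RBF) kernel;
# alpha and gamma act like the smoothing/variance parameters, tuned here by CV
model = GridSearchCV(
    KernelRidge(kernel="rbf"),
    {"alpha": [0.1, 1.0, 10.0], "gamma": [1e-4, 1e-3, 1e-2]},
    cv=5,
)
model.fit(X, y)
print("best params:", model.best_params_, "CV R^2:", round(model.best_score_, 3))
```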

9.
Rubin DB. Biometrics 1991, 47(4): 1213-1234
Causal inference is an important topic and one that is now attracting serious attention from statisticians. Although there exist recent discussions concerning the general definition of causal effects and a substantial literature on specific techniques for the analysis of data in randomized and nonrandomized studies, there has been relatively little discussion of modes of statistical inference for causal effects. This presentation briefly describes and contrasts four basic modes of statistical inference for causal effects, emphasizes the common underlying causal framework with a posited assignment mechanism, and describes practical implications in the context of an example involving the effects of switching from a name-brand to a generic drug. A fundamental conclusion is that in such nonrandomized studies, sensitivity of inference to the assignment mechanism is the dominant issue, and it cannot be avoided by changing modes of inference, for instance, by changing from randomization-based to Bayesian methods.

10.
MOTIVATION: The problem of identifying victims in a mass disaster using DNA fingerprints involves a scale of computation that requires efficient and accurate algorithms. In a typical scenario there are hundreds of samples taken from remains that must be matched to the pedigrees of the alleged victims' surviving relatives. Moreover, the samples are often degraded due to heat and exposure. To develop a competent method for this type of forensic inference problem, the complicated quality issues of DNA typing need to be handled appropriately, the matches between every sample and every family must be considered, and the confidence of matches needs to be provided. RESULTS: We present a unified probabilistic framework that efficiently clusters samples, conservatively eliminates implausible sample-pedigree pairings, and handles both degraded samples (missing values) and experimental errors in producing and/or reading a genotype. We present a method that confidently excludes forensically unambiguous sample-family matches from the large hypothesis space of candidate matches, based on posterior probabilistic inference. Due to the high confidentiality of disaster DNA data, simulation experiments are commonly performed and used here for validation. Our framework is shown to be robust to these errors at levels typical in real applications. Furthermore, the flexibility of the probabilistic models makes it possible to extend this framework to include other biological factors such as interdependent markers, mitochondrial sequences, and blood type. AVAILABILITY: The software and data sets are available from the authors upon request.

11.
12.
A method to denoise single-molecule fluorescence resonance energy transfer (smFRET) trajectories using wavelet detail thresholding and Bayesian inference is presented. Bayesian methods are developed to identify fluorophore photoblinks in the time trajectories. Simulated data are used to quantify the improvement in static and dynamic data analysis. Application of the method to experimental smFRET data shows that it distinguishes photoblinks from large shifts in smFRET efficiency while maintaining the important advantage of an unbiased approach. Known sources of experimental noise are examined and quantified as a means to remove their contributions via soft thresholding of wavelet coefficients. A wavelet decomposition algorithm is described, and thresholds are produced through the knowledge of noise parameters in the discrete-time photon signals. Reconstruction of the signals from thresholded coefficients produces signals that contain noise arising only from unquantifiable parameters. The method is applied to simulated and observed smFRET data, and it is found that the denoised data retain their underlying dynamic properties, but with increased resolution.
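
A bare-bones version of the wavelet step, soft thresholding of the detail coefficients with a universal threshold, can be sketched with PyWavelets (assumed installed). The synthetic two-state trajectory is illustrative, and the paper's Bayesian photoblink identification and noise-parameter-specific thresholds are not reproduced here.

```python
import numpy as np
import pywt  # PyWavelets, assumed available

rng = np.random.default_rng(5)

# noisy two-state "FRET-like" trajectory (synthetic)
true = np.r_[np.full(512, 0.3), np.full(512, 0.7)]
signal = true + rng.normal(scale=0.1, size=true.size)

# wavelet decomposition, soft thresholding of detail coefficients, reconstruction
wavelet, level = "sym8", 5
coeffs = pywt.wavedec(signal, wavelet, level=level)
sigma = np.median(np.abs(coeffs[-1])) / 0.6745           # robust noise estimate
thresh = sigma * np.sqrt(2 * np.log(signal.size))        # universal threshold
coeffs[1:] = [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
denoised = pywt.waverec(coeffs, wavelet)[: signal.size]

print("RMSE noisy:   ", np.sqrt(np.mean((signal - true) ** 2)))
print("RMSE denoised:", np.sqrt(np.mean((denoised - true) ** 2)))
```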

13.
Protein identifications, instead of peptide-spectrum matches, constitute the biologically relevant result of shotgun proteomics studies. How to appropriately infer and report protein identifications has triggered a still ongoing debate. This debate has so far suffered from the lack of appropriate performance measures that allow us to objectively assess protein inference approaches. This study describes an intuitive, generic and yet formal performance measure and demonstrates how it enables experimentalists to select an optimal protein inference strategy for a given collection of fragment ion spectra. We applied the performance measure to systematically explore the benefit of excluding possibly unreliable protein identifications, such as single-hit wonders. Therefore, we defined a family of protein inference engines by extending a simple inference engine by thousands of pruning variants, each excluding a different specified set of possibly unreliable identifications. We benchmarked these protein inference engines on several data sets representing different proteomes and mass spectrometry platforms. Optimally performing inference engines retained all high confidence spectral evidence, without posterior exclusion of any type of protein identifications. Despite the diversity of studied data sets consistently supporting this rule, other data sets might behave differently. In order to ensure maximal reliable proteome coverage for data sets arising in other studies we advocate abstaining from rigid protein inference rules, such as exclusion of single-hit wonders, and instead consider several protein inference approaches and assess these with respect to the presented performance measure in the specific application context.

14.
Studies of social networks provide unique opportunities to assess the causal effects of interventions that may impact more of the population than just those intervened on directly. Such effects are sometimes called peer or spillover effects, and may exist in the presence of interference, that is, when one individual's treatment affects another individual's outcome. Randomization-based inference (RI) methods provide a theoretical basis for causal inference in randomized studies, even in the presence of interference. In this article, we consider RI of the intervention effect in the eX-FLU trial, a randomized study designed to assess the effect of a social distancing intervention on influenza-like-illness transmission in a connected network of college students. The approach considered enables inference about the effect of the social distancing intervention on the per-contact probability of influenza-like-illness transmission in the observed network. The methods allow for interference between connected individuals and for heterogeneous treatment effects. The proposed methods are evaluated empirically via simulation studies, and then applied to data from the eX-FLU trial.
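
For intuition, a minimal randomization-based test of the sharp null of no treatment effect in a simple two-arm design (no interference, no network structure) is sketched below; the data and design are synthetic and far simpler than the eX-FLU setting, where the test statistic and re-randomization scheme must respect the contact network.

```python
import numpy as np

rng = np.random.default_rng(6)

def randomization_test(y, z, n_perm=10_000):
    """Randomization-based p-value for the sharp null of no treatment effect,
    using the difference in means as the test statistic."""
    observed = y[z == 1].mean() - y[z == 0].mean()
    perm_stats = np.empty(n_perm)
    for b in range(n_perm):
        z_perm = rng.permutation(z)              # re-randomize treatment labels
        perm_stats[b] = y[z_perm == 1].mean() - y[z_perm == 0].mean()
    p_value = np.mean(np.abs(perm_stats) >= abs(observed))
    return observed, p_value

# illustrative illness indicators under a completely randomized assignment
z = np.repeat([1, 0], 50)
y = rng.binomial(1, np.where(z == 1, 0.15, 0.30)).astype(float)
print(randomization_test(y, z))
```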

15.
Biophysical models are increasingly used for medical applications at the organ scale. However, model predictions are rarely associated with a confidence measure, although there are important sources of uncertainty in computational physiology methods, for instance the sparsity and noise of the clinical data used to adjust the model parameters (personalization) and the difficulty of accurately modeling soft-tissue physiology. Recent theoretical progress in stochastic models makes their use computationally tractable, but there is still a challenge in estimating patient-specific parameters with such models. In this work we propose an efficient Bayesian inference method for model personalization using polynomial chaos and compressed sensing. This method makes Bayesian inference feasible in real 3D modeling problems. We demonstrate our method on cardiac electrophysiology. We first present validation results on synthetic data, then we apply the proposed method to clinical data. We demonstrate how this can help in quantifying the impact of the data characteristics on the personalization (and thus prediction) results. The described method can be beneficial for the clinical use of personalized models as it explicitly takes into account the uncertainties in the data and the model parameters while still enabling simulations that can be used to optimize treatment. Such uncertainty handling can be pivotal for the proper use of modeling as a clinical tool, because there is a crucial requirement to know the confidence one can have in personalized models.
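
The following toy, one uncertain parameter, a cheap polynomial surrogate of an "expensive" forward model, and a grid-based posterior, illustrates the surrogate-plus-Bayesian-inversion idea only; it does not use compressed sensing or a cardiac electrophysiology model, and every function and value is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

# stand-in for an expensive forward model mapping a tissue parameter to a measurement
def forward_model(theta):
    return np.exp(-theta) + 0.5 * np.sin(3 * theta)

# 1) build a cheap polynomial surrogate from a handful of forward-model runs
theta_train = np.linspace(0.0, 2.0, 9)
surrogate = np.polynomial.Polynomial.fit(theta_train, forward_model(theta_train), deg=6)

# 2) Bayesian inversion on a grid, using the surrogate inside the likelihood
theta_true, noise_sd = 1.2, 0.02
y_obs = forward_model(theta_true) + rng.normal(scale=noise_sd)

grid = np.linspace(0.0, 2.0, 2001)                           # flat prior on [0, 2]
log_post = -0.5 * ((y_obs - surrogate(grid)) / noise_sd) ** 2
post = np.exp(log_post - log_post.max())
post /= post.sum()                                           # normalize over the grid

mean = (grid * post).sum()
sd = np.sqrt(((grid - mean) ** 2 * post).sum())
print(f"true theta = {theta_true}, posterior mean = {mean:.3f} +/- {sd:.3f}")
```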

16.
Binomial regression models are commonly applied to proportion data such as those relating to the mortality and infection rates of diseases. However, it is often the case that the responses may exhibit excessive zeros; in such cases a zero‐inflated binomial (ZIB) regression model can be applied instead. In practice, it is essential to test if there are excessive zeros in the outcome to help choose an appropriate model. The binomial models can yield biased inference if there are excessive zeros, while ZIB models may be unnecessarily complex and hard to interpret, and even face convergence issues, if there are no excessive zeros. In this paper, we develop a new test for testing zero inflation in binomial regression models by directly comparing the amount of observed zeros with what would be expected under the binomial regression model. A closed form of the test statistic, as well as the asymptotic properties of the test, is derived based on estimating equations. Our systematic simulation studies show that the new test performs very well in most cases, and outperforms the classical Wald, likelihood ratio, and score tests, especially in controlling type I errors. Two real data examples are also included for illustrative purposes.
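
The sketch below checks for excess zeros by comparing the observed zero count with its distribution under a binomial GLM fitted with statsmodels, using a crude parametric bootstrap that ignores estimation uncertainty. It is an illustration of the idea of "observed versus expected zeros", not the paper's closed-form estimating-equation statistic.

```python
import numpy as np
import statsmodels.api as sm  # assumed available

rng = np.random.default_rng(8)

# synthetic proportion data with extra (structural) zeros
n, m = 300, 10                                   # n groups, m trials each
x = rng.normal(size=n)
p = 1 / (1 + np.exp(-(-1.0 + 0.8 * x)))
y = rng.binomial(m, p)
y[rng.random(n) < 0.25] = 0                      # inject zero inflation

# fit an ordinary binomial GLM (no zero inflation)
X = sm.add_constant(x)
fit = sm.GLM(np.column_stack([y, m - y]), X, family=sm.families.Binomial()).fit()
p_hat = fit.predict(X)

# compare the observed zero count with its parametric-bootstrap distribution
obs_zeros = np.sum(y == 0)
boot_zeros = np.array([np.sum(rng.binomial(m, p_hat) == 0) for _ in range(2000)])
p_value = np.mean(boot_zeros >= obs_zeros)
print(f"observed zeros: {obs_zeros}, expected ~ {boot_zeros.mean():.1f}, p = {p_value:.4f}")
```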

17.
Most models and algorithms developed to perform statistical inference from DNA data make the assumption that substitution processes affecting distinct nucleotide sites are stochastically independent. This assumption ensures both mathematical and computational tractability but is in disagreement with observed data in many situations--one well-known example being CpG dinucleotide hypermutability in mammalian genomes. In this paper, we consider the class of RN95 + YpR substitution models, which allows neighbor-dependent effects--including CpG hypermutability--to be taken into account, through transitions between pyrimidine-purine dinucleotides. We show that it is possible to adapt inference methods originally developed under the assumption of independence between sites to RN95 + YpR models, using a mathematically rigorous framework provided by specific structural properties of this class of models. We assess how efficient this approach is at inferring the CpG hypermutability rate from aligned DNA sequences. The method is tested on simulated data and compared against several alternatives; the results suggest that it delivers a high degree of accuracy at a low computational cost. We then apply our method to an alignment of 10 DNA sequences from primate species. Model comparisons within the RN95 + YpR class show the importance of taking into account neighbor-dependent effects. An application of the method to the detection of hypomethylated islands is discussed.

18.
The existence of haplotype blocks transmitted from parents to offspring has been suggested recently. This has created an interest in the inference of the block structure and length. The motivation is that haplotype blocks that are characterized well will make it easier to quickly map all the genes carrying human diseases. To study the inference of haplotype blocks systematically, we propose a statistical framework. In this framework, the optimal haplotype block partitioning is formulated as the problem of statistical model selection; missing data can be handled in a standard statistical way; population strata can be implemented; block structure inference/hypothesis testing can be performed; prior knowledge, if present, can be incorporated to perform a Bayesian inference. The algorithm is linear in the number of loci, whereas many such algorithms are NP-hard. We illustrate the applications of our method to both simulated and real data sets.
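
As an illustration of block partitioning by optimizing an additive score, the dynamic program below finds the best partition for a toy per-block cost (number of distinct haplotypes plus a fixed penalty). The cost function is a hypothetical stand-in for the paper's statistical model-selection criterion, and the bounded block length is what keeps the run time linear in the number of loci here.

```python
import numpy as np

def block_cost(haps, start, end):
    """Illustrative per-block score: number of distinct haplotypes in the block
    plus a small per-block penalty acting as a model-complexity term."""
    distinct = len({tuple(h[start:end]) for h in haps})
    return distinct + 2.0

def partition_blocks(haps, max_block=20):
    """Dynamic program over block boundaries: best[j] is the minimal total cost
    of partitioning loci 0..j-1; with bounded block length this takes
    O(m * max_block) boundary evaluations."""
    m = haps.shape[1]
    best = np.full(m + 1, np.inf)
    back = np.zeros(m + 1, dtype=int)
    best[0] = 0.0
    for j in range(1, m + 1):
        for i in range(max(0, j - max_block), j):
            cost = best[i] + block_cost(haps, i, j)
            if cost < best[j]:
                best[j], back[j] = cost, i
    # recover block boundaries by walking the back-pointers
    blocks, j = [], m
    while j > 0:
        blocks.append((back[j], j))
        j = back[j]
    return blocks[::-1], best[m]

rng = np.random.default_rng(9)
haps = rng.integers(0, 2, size=(40, 60))          # toy 0/1 haplotype matrix
print(partition_blocks(haps))
```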

19.
Inferring haplotype data from genotype data is a crucial step in linking SNPs to human diseases. Given n genotypes over m SNP sites, the haplotype inference (HI) problem deals with finding a set of haplotypes so that each given genotype can be formed by combining a pair of haplotypes from the set. The perfect phylogeny haplotyping (PPH) problem is one of the many computational approaches to the HI problem. Though it was conjectured that the complexity of the PPH problem was O(nm), the complexity of all the solutions presented until recently was O(nm^2). In this paper, we make complete use of the column ordering that was presented earlier and show that there must be some interdependencies among the pairwise relationships between SNP sites in order for the given genotypes to allow a perfect phylogeny. Based on these interdependencies, we introduce the FlexTree (flexible tree) data structure that represents all the pairwise relationships in O(m) space. The FlexTree data structure provides a compact representation of all the perfect phylogenies for the given set of genotypes. We also introduce an ordering of the genotypes that allows the genotypes to be added to the FlexTree sequentially. The column ordering, the FlexTree data structure, and the row ordering we introduce make the O(nm) OPPH algorithm possible. We present some results on simulated data which demonstrate that the OPPH algorithm performs quite impressively when compared to the previous algorithms. The OPPH algorithm is one of the first O(nm) algorithms presented for the PPH problem.

20.
To investigate the effect of cell-to-cell variation in store-operated calcium entry (SOCE) on the evaluation of data from stable cell clones selected following gene transfection, we measured SOCE in 2700 individual HEK-293 cells from the parent population and in 1900 individual cells from a clonal subpopulation of HEK-293 cells. We applied statistical resampling techniques to model conditions where one would compare the average SOCE in n control clones to the average SOCE in n experimental clones (n = 1-200). For an overexpression experiment with n = 1, there is a 27% chance of observing a 100% or higher difference in SOCE between clones, with n = 10 there is a 34% probability of observing a 20% or greater difference in SOCE, and with n = 100, there is less than a 10% chance of seeing a 10% or greater difference in SOCE, based solely on random selection of clones from the parent HEK-293 cell population. To assure that the degree of cell-to-cell variation was predictive of the degree of clone-to-clone variation, we measured SOCE in 270 clones, each arising from a single cell, and found the variation to be very similar to that observed for individual cells.
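
A sketch of the resampling idea, with synthetic single-cell SOCE values standing in for the real measurements: it estimates how often two groups of n clones drawn from the same parent distribution would differ in mean SOCE by at least a given percentage. The log-normal stand-in distribution and all parameters are assumptions for illustration, so the printed probabilities will not reproduce the paper's figures.

```python
import numpy as np

rng = np.random.default_rng(10)

# stand-in for the single-cell SOCE measurements of the parent population
# (the real study measured ~2700 HEK-293 cells; these values are synthetic)
parent_soce = rng.lognormal(mean=0.0, sigma=0.5, size=2700)

def prob_spurious_difference(values, n_clones, pct, n_resamples=20_000):
    """Probability that two random samples of n_clones clones, drawn from the
    same parent distribution, differ in mean SOCE by at least pct percent."""
    hits = 0
    for _ in range(n_resamples):
        a = rng.choice(values, size=n_clones, replace=True).mean()
        b = rng.choice(values, size=n_clones, replace=True).mean()
        if abs(a - b) / min(a, b) >= pct / 100.0:
            hits += 1
    return hits / n_resamples

for n, pct in [(1, 100), (10, 20), (100, 10)]:
    print(f"n = {n:3d}: P(>= {pct}% difference) ~ "
          f"{prob_spurious_difference(parent_soce, n, pct):.2f}")
```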
