1.

Mutation parameters from DNA sequence data using graph theoretic measures on lineage trees

Magori-Cohen R Louzoun Y Kleinstein SH 《Bioinformatics (Oxford, England)》2006,22(14):e332-e340

MOTIVATION: B cells responding to antigenic stimulation can fine-tune their binding properties through a process of affinity maturation composed of somatic hypermutation, affinity-selection and clonal expansion. The mutation rate of the B cell receptor DNA sequence, and the effect of these mutations on affinity and specificity, are of critical importance for understanding immune and autoimmune processes. Unbiased estimates of these properties are currently lacking due to the short time-scales involved and the small numbers of sequences available. RESULTS: We have developed a bioinformatic method based on a maximum likelihood analysis of phylogenetic lineage trees to estimate the parameters of a B cell clonal expansion model, which includes somatic hypermutation with the possibility of lethal mutations. Lineage trees are created from clonally related B cell receptor DNA sequences. Important links between tree shapes and underlying model parameters are identified using mutual information. Parameters are estimated using a likelihood function based on the joint distribution of several tree shapes, without requiring a priori knowledge of the number of generations in the clone (which is not available for rapidly dividing populations in vivo). A systematic validation on synthetic trees produced by a mutating birth-death process simulation shows that our estimates are precise and robust to several underlying assumptions. These methods are applied to experimental data from autoimmune mice to demonstrate the existence of hypermutating B cells in an unexpected location in the spleen.

2.

Semiparametric maximum likelihood for measurement error model regression

Schafer DW 《Biometrics》2001,57(1):53-61

This paper presents an EM algorithm for semiparametric likelihood analysis of linear, generalized linear, and nonlinear regression models with measurement errors in explanatory variables. A structural model is used in which probability distributions are specified for (a) the response and (b) the measurement error. A distribution is also assumed for the true explanatory variable but is left unspecified and is estimated by nonparametric maximum likelihood. For various types of extra information about the measurement error distribution, the proposed algorithm makes use of available routines that would be appropriate for likelihood analysis of (a) and (b) if the true x were available. Simulations suggest that the semiparametric maximum likelihood estimator retains a high degree of efficiency relative to the structural maximum likelihood estimator based on correct distributional assumptions and can outperform maximum likelihood based on an incorrect distributional assumption. The approach is illustrated on three examples with a variety of structures and types of extra information about the measurement error distribution.

3.

Prioritizing candidate peptides for cancer vaccines through predicting peptide presentation by HLA-I proteins

Laura Y. Zhou Fei Zou Wei Sun 《Biometrics》2023,79(3):2664-2676

Cancer (treatment) vaccines that are made of neoantigens, or peptides unique to tumor cells due to somatic mutations, have emerged as a promising method to reinvigorate the immune response against cancer. A key step to prioritizing neoantigens for cancer vaccines is computationally predicting which neoantigens are presented on the cell surface by a human leukocyte antigen (HLA). We propose to address this challenge by training a neural network using mass spectrometry (MS) data composed of peptides presented by at least one of several HLAs of a subject. We embed the neural network within a mixture model and train the neural network by maximizing the likelihood of the mixture model. After evaluating our method using data sets where the peptide presentation status was known, we applied it to analyze somatic mutations of 60 melanoma patients and identified a group of neoantigens more immunogenic in tumor cells than in normal cells. Moreover, neoantigen burden estimated by our method was significantly associated with a measurement of the immune system activity, suggesting these neoantigens could induce an immune response.

4.

Multilevel models for survival analysis with random effects

Yau KK 《Biometrics》2001,57(1):96-102

A method for modeling survival data with multilevel clustering is described. The Cox partial likelihood is incorporated into the generalized linear mixed model (GLMM) methodology. Parameter estimation is achieved by maximizing a log likelihood analogous to the likelihood associated with the best linear unbiased prediction (BLUP) at the initial step of estimation and is extended to obtain residual maximum likelihood (REML) estimators of the variance component. Estimating equations for a three-level hierarchical survival model are developed in detail, and such a model is applied to analyze a set of chronic granulomatous disease (CGD) data on recurrent infections as an illustration with both hospital and patient effects being considered as random. Only the latter gives a significant contribution. A simulation study is carried out to evaluate the performance of the REML estimators. Further extension of the estimation procedure to models with an arbitrary number of levels is also discussed.

5.

Asymptotically efficient estimation for the multivariate binary response probit model

Ma Jianjun (马建军) Xu Xingzhong (徐兴忠) 《Journal of Biomathematics (生物数学学报)》2008,23(4):677-686

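A two-step estimation method is proposed for the multivariate binary response probit model. In the first step, a √n-consistent estimator of the parameters is obtained from the marginal likelihood; in the second step, an asymptotically efficient estimator is obtained through a single iteration. Because only one iteration is needed, the number of simulation draws can be increased when the information matrix is computed by simulation, which reduces the effect of simulation noise on the estimates.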

6.

The parameter identification problem for the somatic shunt model

John A. White Paul B. Manis Eric D. Young 《Biological cybernetics》1992,66(4):307-318

The somatic shunt model, a generalized version of the Rall equivalent cylinder model, is used commonly to describe the passive electrotonic properties of neurons. Procedures for determining the parameters of the somatic shunt model that best describe a given neuron typically rely on the response of the cell to a small step of hyperpolarizing current injected by an intrasomatic recording electrode. In this study it is shown that the problem of estimating model parameters for the somatic shunt model using physiological data is ill-posed, in that very small errors in measured data can lead to large and unpredictable errors in parameter estimates. If the somatic shunt is assumed to be a real property of the intact neuron, the effects of these errors are not severe when predicting EPSP waveshapes resulting from synaptic input at a given location. However, if the somatic shunt is assumed to be a consequence of a leakage pathway around the recording electrode, and a correction for the shunt is applied, then the instability of the inverse problem can introduce large errors in estimates of EPSP waveshape as a function of synaptic location in the intact cell. Morphological constraints can be used to improve the accuracy of the inversion procedure in terms of both parameter estimates and predicted EPSP responses.

7.

An asymptotic theory for model selection inference in general semiparametric problems (total citations: 2; self-citations: 0; citations by others: 2)

Claeskens Gerda; Carroll Raymond J. 《Biometrika》2007,94(2):249-265

Hjort & Claeskens (2003) developed an asymptotic theory for model selection, model averaging and subsequent inference using likelihood methods in parametric models, along with associated confidence statements. In this article, we consider a semiparametric version of this problem, wherein the likelihood depends on parameters and an unknown function, and model selection/averaging is to be applied to the parametric parts of the model. We show that all the results of Hjort & Claeskens hold in the semiparametric context, if the Fisher information matrix for parametric models is replaced by the semiparametric information bound for semiparametric models, and if maximum likelihood estimators for parametric models are replaced by semiparametric efficient profile estimators. Our methods of proof employ Le Cam's contiguity lemmas, leading to transparent results. The results also describe the behaviour of semiparametric model estimators when the parametric component is misspecified, and also have implications for pointwise-consistent model selectors.

8.

The proteome profile of embryogenic cell suspensions of Coffea arabica L.

Nádia A. Campos Luciano V. Paiva Bart Panis Sebastien C. Carpentier 《Proteomics》2016,16(6):1001-1005

9.

Statistical aspects of genetic mapping in autopolyploids (total citations: 8; self-citations: 0; citations by others: 8)

M I Ripol G A Churchill J A da Silva M Sorrells 《Gene》1999,235(1-2):31-41

Many plant species of agricultural importance are polyploid, having more than two copies of each chromosome per cell. In this paper, we describe statistical methods for genetic map construction in autopolyploid species with particular reference to the use of molecular markers. The first step is to determine the dosage of each DNA fragment (electrophoretic band) from its segregation ratio. Fragments present in a single dose can be used to construct framework maps for individual chromosomes. Fragments present in multiple doses can often be used to link the single chromosome maps into homologous groups and provide additional ordering information. Marker phenotype probabilities were calculated for pairs of markers arranged in different configurations among the homologous chromosomes. These probabilities were used to compute a maximum likelihood estimator of the recombination fraction between pairs of markers. A likelihood ratio test for linkage of multidose markers was derived. The information provided by each configuration and power and sample size considerations are also discussed. A set of 294 RFLP markers scored on 90 plants of the species Saccharum spontaneum L. was used to illustrate the construction of an autopolyploid map. Previous studies conducted on the same data revealed that this species of sugar cane is an autooctaploid with 64 chromosomes arranged into eight homologous groups. The methodology described permitted consolidation of 54 linkage groups into ten homologous groups.

10.

A Bayesian Decision Procedure for Selecting Prognostic Variables Associated with Survival for Data in which Censoring is Prevalent

Martin D. Fraser Alfred A. Bartolucci William A. Smith Karen P. Singh 《Biometrical journal. Biometrische Zeitschrift》1995,37(4):463-479

A Bayesian procedure is developed for the selection of concomitant variables in survival models. The variables are selected in a step-up procedure according to the criterion of maximum expected likelihood, where the expectation is over the prior parameter space. Prior knowledge of the influence of these covariates on patient prognosis is incorporated into the analysis. The step-up procedure is stopped when the Bayes factor in favor of omitting the variable selected in a particular step exceeds a specified value. The resulting model with the selected variables is fitted using Bayes estimates of the coefficients. This technique is applied to Hodgkin's disease data from a large Cooperative Clinical Trial Group and the results are compared to the results from the classical likelihood selection procedure.

11.

Information dynamics in living systems: prokaryotes, eukaryotes, and cancer

Frieden BR Gatenby RA 《PloS one》2011,6(7):e22085

Background

Living systems use information and energy to maintain stable entropy while far from thermodynamic equilibrium. The underlying first principles have not been established.

Findings

We propose that stable entropy in living systems, in the absence of thermodynamic equilibrium, requires an information extremum (maximum or minimum), which is invariant to first order perturbations. Proliferation and death represent key feedback mechanisms that promote stability even in a non-equilibrium state. A system moves to low or high information depending on its energy status, as the benefit of information in maintaining and increasing order is balanced against its energy cost. Prokaryotes, which lack specialized energy-producing organelles (mitochondria), are energy-limited and constrained to an information minimum. Acquisition of mitochondria is viewed as a critical evolutionary step that, by allowing eukaryotes to achieve a sufficiently high energy state, permitted a phase transition to an information maximum. This state, in contrast to the prokaryote minima, allowed evolution of complex, multicellular organisms. A special case is a malignant cell, which is modeled as a phase transition from a maximum to minimum information state. The minimum leads to a predicted power-law governing the in situ growth that is confirmed by studies measuring growth of small breast cancers.

Conclusions

We find living systems achieve a stable entropic state by maintaining an extreme level of information. The evolutionary divergence of prokaryotes and eukaryotes resulted from acquisition of specialized energy organelles that allowed transition from information minima to maxima, respectively. Carcinogenesis represents a reverse transition: from an information maximum to a minimum. The progressive information loss is evident in the accumulating mutations, disordered morphology, and functional decline characteristic of human cancers. The findings suggest energy restriction is a critical first step that triggers the genetic mutations that drive somatic evolution of the malignant phenotype.

12.

On the near-singularity of models for animal recovery data

Catchpole EA Kgosi PM Morgan BJ 《Biometrics》2001,57(3):720-726

13.

Inference in MCMC step selection models

Théo Michelot Paul G. Blackwell Simon Chamaillé-Jammes Jason Matthiopoulos 《Biometrics》2020,76(2):438-447

Habitat selection models are used in ecology to link the spatial distribution of animals to environmental covariates and identify preferred habitats. The most widely used models of this type, resource selection functions, aim to capture the steady-state distribution of space use of the animal, but they assume independence between the observed locations of an animal. This is unrealistic when location data display temporal autocorrelation. The alternative approach of step selection functions embeds habitat selection in a model of animal movement, to account for the autocorrelation. However, inferences from step selection functions depend on the underlying movement model, and they do not readily predict steady-state space use. We suggest an analogy between parameter updates and target distributions in Markov chain Monte Carlo (MCMC) algorithms, and step selection and steady-state distributions in movement ecology, leading to a step selection model with an explicit steady-state distribution. In this framework, we explain how maximum likelihood estimation can be used for simultaneous inference about movement and habitat selection. We describe the local Gibbs sampler, a novel rejection-free MCMC scheme, use it as the basis of a flexible class of animal movement models, and derive its likelihood function for several important special cases. In a simulation study, we verify that maximum likelihood estimation can recover all model parameters. We illustrate the application of the method with data from a zebra.

14.

A Stochastic Regression Model for General Trend Analysis of Longitudinal Continuous Data

Wei‐Hsiung Chao Su‐Hua Chen 《Biometrical journal. Biometrische Zeitschrift》2009,51(4):571-587

A predictive continuous time model is developed for continuous panel data to assess the effect of time‐varying covariates on the general direction of the movement of a continuous response that fluctuates over time. This is accomplished by reparameterizing the infinitesimal mean of an Ornstein–Uhlenbeck process in terms of its equilibrium mean and a drift parameter, which assesses the rate at which the process reverts to its equilibrium mean. The equilibrium mean is modeled as a linear predictor of covariates. This model can be viewed as a continuous time first‐order autoregressive regression model with time‐varying lag effects of covariates and the response, which is more appropriate for unequally spaced panel data than its discrete time analog. Both maximum likelihood and quasi‐likelihood approaches are considered for estimating the model parameters and their performances are compared through simulation studies. The simpler quasi‐likelihood approach is suggested because it yields an estimator that is of high efficiency relative to the maximum likelihood estimator and it yields a variance estimator that is robust to the diffusion assumption of the model. To illustrate the proposed model, an application to diastolic blood pressure data from a follow‐up study on cardiovascular diseases is presented. Missing observations are handled naturally with this model.
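The reparameterization described in this abstract can be sketched with the textbook Ornstein–Uhlenbeck form. The symbols below (θ for the drift parameter, μ for the equilibrium mean, σ for the diffusion scale, x and β for covariates and coefficients, Δ for the gap between visits) are assumptions chosen for illustration and are not taken from the paper, and the covariates are treated as constant over each interval:

```latex
% Textbook OU process written in terms of an equilibrium mean and a drift rate
% (illustrative notation; not necessarily the authors' parameterization)
\begin{align}
  dY(t) &= \theta\,\bigl(\mu - Y(t)\bigr)\,dt + \sigma\,dW(t),
  \qquad \mu = x^{\top}\beta, \quad \theta > 0, \\
  \mathbb{E}\bigl[Y(t+\Delta)\mid Y(t)\bigr]
        &= \mu + \bigl(Y(t) - \mu\bigr)\,e^{-\theta\Delta}, \\
  \operatorname{Var}\bigl[Y(t+\Delta)\mid Y(t)\bigr]
        &= \frac{\sigma^{2}}{2\theta}\,\bigl(1 - e^{-2\theta\Delta}\bigr).
\end{align}
```

Under these assumptions the weight on the previous response, e^{-θΔ}, shrinks as the gap Δ grows, which is one way to read the abstract's remark about time‐varying lag effects with unequally spaced panel data.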

15.

Parametric proportional hazards model for mapping genomic imprinting of survival traits

Huijiang Gao Yongxin Liu Tingting Zhang Runqing Yang Daniel R. Prows 《Journal of applied genetics》2013,54(1):79-88

A number of imprinted genes have been observed in plants, animals and humans. They not only control growth and developmental traits, but may also be responsible for survival traits. Based on the Cox proportional hazards (PH) model, we constructed a general parametric model for dissecting genomic imprinting, in which a baseline hazard function is selectable for fitting the effects of imprinted quantitative trait loci (iQTL) genotypes on the survival curve. The expectation–maximisation (EM) algorithm is derived for solving the maximum likelihood estimates of iQTL parameters. The imprinting patterns of the detected iQTL are statistically tested under a series of null hypotheses. The Bayesian information criterion (BIC) is employed to choose an optimal baseline hazard function with maximum likelihood and parsimonious parameterisation. We applied the proposed approach to analyse the published data in an F₂ population of mice and concluded that, among five commonly used survival distributions, the log-logistic distribution is the optimal baseline hazard function for the survival time of hyperoxic acute lung injury (HALI). Under this optimal model, five QTL were detected, among which four are imprinted in different imprinting patterns.

16.

Landscape genetic inferences vary with sampling scenario for a pond‐breeding amphibian

Travis Seaborn Samantha S. Hauser Lauren Konrade Lisette P. Waits Caren S. Goldberg 《Ecology and evolution》2019,9(9):5063-5078

A critical decision in landscape genetic studies is whether to use individuals or populations as the sampling unit. This decision affects the time and cost of sampling and may affect ecological inference. We analyzed 334 Columbia spotted frogs at 8 microsatellite loci across 40 sites in northern Idaho to determine how inferences from landscape genetic analyses would vary with sampling design. At all sites, we compared a proportion available sampling scheme (PASS), in which all samples were used, to resampled datasets of 2–11 individuals. Additionally, we compared a population sampling scheme (PSS) to an individual sampling scheme (ISS) at 18 sites with sufficient sample size. We applied an information theoretic approach with both restricted maximum likelihood and maximum likelihood estimation to evaluate competing landscape resistance hypotheses. We found that PSS supported low‐density forest when restricted maximum likelihood was used, but a combination model of most variables when maximum likelihood was used. We also saw variations when AIC was used compared to BIC. ISS supported this model as well as additional models when testing hypotheses of land cover types that create the greatest resistance to gene flow for Columbia spotted frogs. Increased sampling density and study extent, seen by comparing PSS to PASS, showed a change in model support. As number of individuals increased, model support converged at 7–9 individuals for ISS to PSS. ISS may be useful to increase study extent and sampling density, but may lack power to provide strong support for the correct model with microsatellite datasets. Our results highlight the importance of additional research on sampling design effects on landscape genetics inference.

17.

Shrinkage Pre-Test Estimator of the Intercept Parameter for a Regression Model with Multivariate Student-t Errors

Shahjahan Khan A. K. Md. Ehsanes Saleh 《Biometrical journal. Biometrische Zeitschrift》1997,39(2):131-147

In the presence of uncertain prior information about the value of the slope parameter, the estimation of the intercept parameter of a simple regression model with a multivariate Student-t error distribution is investigated. The unrestricted, restricted and shrinkage preliminary test maximum likelihood estimators are defined. The expressions for the bias and the mean square error of the three estimators are provided and the relative efficiencies are analyzed. A maximin criterion is established, and graphs are constructed for an arbitrary number of degrees of freedom (D.F.) as well as sample sizes. A criterion to select the optimal significance level is also discussed.

18.

Structural inference in transition measurement error models for longitudinal data

Pan W Lin X Zeng D 《Biometrics》2006,62(2):402-412

We propose a new class of models, transition measurement error models, to study the effects of covariates and the past responses on the current response in longitudinal studies when one of the covariates is measured with error. We show that the response variable conditional on the error-prone covariate follows a complex transition mixed effects model. The naive model obtained by ignoring the measurement error correctly specifies the transition part of the model, but misspecifies the covariate effect structure and ignores the random effects. We next study the asymptotic bias in the naive estimator obtained by ignoring the measurement error for both continuous and discrete outcomes. We show that the naive estimator of the regression coefficient of the error-prone covariate is attenuated, while the naive estimators of the regression coefficients of the past responses are generally inflated. We then develop a structural modeling approach for parameter estimation using the maximum likelihood estimation method. In view of the multidimensional integration required by full maximum likelihood estimation, an EM algorithm is developed to calculate maximum likelihood estimators, in which Monte Carlo simulations are used to evaluate the conditional expectations in the E-step. We evaluate the performance of the proposed method through a simulation study and apply it to a longitudinal social support study for elderly women with heart disease. An additional simulation study shows that the Bayesian information criterion (BIC) performs well in choosing the correct transition orders of the models.

19.

Identification of a multihit model for nonhomogeneous cell population

L V Pavlova L G Khanin A Iu Iakovlev 《Radiobiologiia》1992,32(6):785-787

A generalized multihit-multitarget model is presented for a population of irradiated cells that is nonhomogeneous with respect to radiosensitivity. Least squares and maximum likelihood estimation of the model parameters are given. The quality of the estimates is evaluated in a computer-based study. The results show that the distribution of cell radiosensitivity can be identified parametrically from "dose-response" data.

20.

Upper bounds on maximum likelihood for phylogenetic trees

Hendy MD Holland BR 《Bioinformatics (Oxford, England)》2003,19(Z2):ii66-ii72

We introduce a mechanism for analytically deriving upper bounds on the maximum likelihood for genetic sequence data on sets of phylogenies. A simple 'partition' bound is introduced for general models. Tighter bounds are developed for the simplest model of evolution, the two state symmetric model of nucleotide substitution under the molecular clock. This follows earlier theoretical work which has been restricted to this model by analytic complexity. A weakness of current numerical computation is that reported 'maximum likelihood' results cannot be guaranteed, either for a specified tree (because of the possibility of multiple maxima) or over the full tree space (as the computation is intractable for large sets of trees). The bounds we develop here can be used to conclusively eliminate large proportions of tree space in the search for the maximum likelihood tree. This is vital in the development of a branch and bound search strategy for identifying the maximum likelihood tree. We report the results from a simulation study of approximately 10^6 data sets generated on clock-like trees of five leaves. In each trial a likelihood value of one specific instance of a parameterised tree is compared to the bound determined for each of the 105 possible rooted binary trees. The proportion of trees that are eliminated from the search for the maximum likelihood tree ranged from 92% to almost 98%, indicating a computational speed-up factor of between 12 and 44.
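The two figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses the standard double-factorial count (2n-3)!! of rooted binary leaf-labelled trees, and it assumes (this is an assumption, not something the abstract states) that the speed-up factor is simply the reciprocal of the fraction of trees that survive elimination. The function names are hypothetical.

```python
from math import prod


def rooted_binary_tree_count(leaves: int) -> int:
    """Count rooted binary leaf-labelled trees: (2n-3)!! for n >= 2 leaves."""
    if leaves < 2:
        return 1
    # Product of the odd numbers 1 * 3 * 5 * ... * (2n - 3).
    return prod(range(1, 2 * leaves - 2, 2))


def speedup_factor(fraction_eliminated: float) -> float:
    """Assumed speed-up: reciprocal of the fraction of trees still searched."""
    return 1.0 / (1.0 - fraction_eliminated)


if __name__ == "__main__":
    print(rooted_binary_tree_count(5))      # trees on five leaves
    print(round(speedup_factor(0.92), 1))   # lower end of the reported range
    print(round(speedup_factor(0.977), 1))  # an illustrative "almost 98%" value
```

Under that assumption, eliminating 92% of trees gives a factor of about 12.5, and a factor of about 44 corresponds to eliminating roughly 97.7% of trees, which is consistent with the "almost 98%" reported.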