Here we focus on discrimination problems where the number of predictors substantially exceeds the sample size and we propose a Bayesian variable selection approach to multinomial probit models. Our method makes use of mixture priors and Markov chain Monte Carlo techniques to select sets of variables that differ among the classes. We apply our methodology to a problem in functional genomics using gene expression profiling data. The aim of the analysis is to identify molecular signatures that characterize two different stages of rheumatoid arthritis.

Quantal bioassay experiments relate the amount or potency of some compound (for example, a poison, antibody, or drug) to a binary outcome such as death or infection in animals. For infectious diseases, probit regression is commonly used for inference and a key measure of potency is given by the IDP, the amount that results in P% of the animals being infected. In some experiments, a validation set may be used where both direct and proxy measures of the dose are available on a subset of animals, with the proxy being available on all. The proxy variable can be viewed as a messy reflection of the direct variable, leading to an errors‐in‐variables problem. We develop a model for the validation set and use a constrained seemingly unrelated regression (SUR) model to obtain the distribution of the direct measure conditional on the proxy. We use the conditional distribution to derive a pseudo‐likelihood based on probit regression and use the parametric bootstrap for statistical inference. We re‐evaluate an old experiment in 21 monkeys where neutralizing antibodies (nABs) to HIV were measured using an old (proxy) assay in all monkeys and with a new (direct) assay in a validation set of 11 who had sufficient stored plasma. Using our methods, we obtain an estimate of the ID1 for the new assay, an important target for HIV vaccine candidates. In simulations, we compare the pseudo‐likelihood estimates with regression calibration and a full joint likelihood approach.
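As a minimal sketch of how an IDP follows from a fitted probit curve, assume the simple model P(infected | dose x) = Φ(intercept + slope · x), where Φ is the standard normal distribution function. The function names and the coefficient values below are hypothetical and are not taken from the study; the sketch also ignores the errors‐in‐variables correction that the abstract describes.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def infection_probability(dose, intercept, slope):
    """Probit model: probability of infection at a given dose."""
    return STANDARD_NORMAL.cdf(intercept + slope * dose)


def infectious_dose(p_percent, intercept, slope):
    """Dose at which p_percent of animals are expected to be infected.

    Solves Phi(intercept + slope * x) = p_percent / 100 for x.
    """
    if not 0.0 < p_percent < 100.0:
        raise ValueError("p_percent must lie strictly between 0 and 100")
    if slope == 0:
        raise ValueError("slope must be non-zero for the dose to be identifiable")
    z = STANDARD_NORMAL.inv_cdf(p_percent / 100.0)
    return (z - intercept) / slope


# Hypothetical coefficients: a negative slope means higher doses protect.
id50 = infectious_dose(50, intercept=1.0, slope=-0.5)
id1 = infectious_dose(1, intercept=1.0, slope=-0.5)
```

With a protective compound such as a neutralizing antibody the slope is negative, so the ID1 corresponds to a larger dose than the ID50; the same inversion applies whatever scale (for example, log titer) the dose is measured on.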

Insects use highly distributed nervous systems to process exteroception from head sensors, compare that information with state-based goals, and direct posture or locomotion toward those goals. To study how descending commands from brain centers produce coordinated, goal-directed motion in distributed nervous systems, we have constructed a conductance-based neural system for our robot MantisBot, a 29 degree-of-freedom, 13.3:1 scale praying mantis robot. Using the literature on mantis prey tracking and insect locomotion, we designed a hierarchical, distributed neural controller that establishes the goal, coordinates different joints, and executes prey-tracking motion. In our controller, brain networks perceive the location of prey and predict its future location, store this location in memory, and formulate descending commands for ballistic saccades like those seen in the animal. The descending commands are simple, indicating only 1) whether the robot should walk or stand still, and 2) the intended direction of motion. Each joint's controller uses the descending commands differently to alter sensory-motor interactions, changing the sensory pathways that coordinate the joints' central pattern generators into one cohesive motion. Experiments with one leg of MantisBot show that visual input produces simple descending commands that alter walking kinematics, change the walking direction in a predictable manner, enact reflex reversals when necessary, and can control both static posture and locomotion with the same network.

We propose a parametric regression model for the cumulative incidence functions (CIFs) commonly used for competing risks data. The model adopts a modified logistic model as the baseline CIF and a generalized odds‐rate model for covariate effects, and it explicitly takes into account the constraint that a subject with any given prognostic factors should eventually fail from one of the causes, so that the asymptotes of the CIFs add up to one. This constraint intrinsically holds in a nonparametric analysis without covariates, but is easily overlooked in a semiparametric or parametric regression setting. We hence model the CIF from the primary cause assuming the generalized odds‐rate transformation and the modified logistic function as the baseline CIF. Under the additivity constraint, the covariate effects on the competing cause are modeled by a function of the asymptote of the baseline distribution and the covariate effects on the primary cause. Inference is straightforward using standard maximum likelihood theory. We demonstrate desirable finite‐sample performance of our model by simulation studies in comparison with existing methods. Its practical utility is illustrated in an analysis of a breast cancer dataset to assess the treatment effect of tamoxifen, adjusting for age and initial pathological tumor size, on breast cancer recurrence that is subject to dependent censoring by second primary cancers and deaths.

Atrial fibrillation (AF) is an abnormal heart rhythm characterized by rapid and irregular heartbeat, with or without perceivable symptoms. In clinical practice, the electrocardiogram (ECG) is often used for diagnosis of AF. Because AF often occurs as recurrent episodes of varying frequency and duration, and only the episodes that occur at the time of an ECG can be detected, AF is often underdiagnosed when a limited number of repeated ECGs are used. In studies evaluating the efficacy of AF ablation surgery, each patient undergoes multiple ECGs and the AF status at the time of ECG is recorded. The objective of this paper is to estimate the marginal proportions of patients with or without AF in a population, which are important measures of the efficacy of the treatment. The underdiagnosis problem is addressed by a three‐class mixture regression model in which a patient's probabilities of having no AF, paroxysmal AF, and permanent AF are modeled by auxiliary baseline covariates in a nested logistic regression. A binomial regression model is specified conditional on a subject being in the paroxysmal AF group. The model parameters are estimated by the Expectation‐Maximization (EM) algorithm. These parameters are themselves nuisance parameters for the purpose of this research, but the estimators of the marginal proportions of interest can be expressed as functions of the data and these nuisance parameters, and their variances can be estimated by the sandwich method. We examine the performance of the proposed methodology in simulations and two real data applications.

The marginal regression model offers a useful alternative to conditional approaches to analyzing binary data (Liang, Zeger, and Qaqish, 1992, Journal of the Royal Statistical Society, Series B 54, 3-40). Instead of modelling the binary data directly as do Liang and Zeger (1986, Biometrika 73, 13-22), the parametric marginal regression model developed by Qu et al. (1992, Biometrics 48, 1095-1102) assumes that there is an underlying multivariate normal vector that gives rise to the observed correlated binary outcomes. Although this parametric approach provides a flexible way to model different within-cluster correlation structures and does not restrict the parameter space, it is of interest to know how robust the parameter estimates are with respect to choices of the latent distribution. We first extend the latent modelling to include multivariate t-distributed latent vectors and assess the robustness in this class of distributions. Then we show through a simulation that the parameter estimates are robust with respect to the latent distribution even if the latent distribution is skewed. In addition to this empirical evidence for robustness, we show through the iterative algorithm that the robustness of the regression coefficients with respect to misspecifications of covariance structure in Liang and Zeger's model in fact indicates robustness with respect to underlying distributional assumptions of the latent vector in the latent variable model.

For many diseases the infection status of individuals cannot be observed directly, but can only be inferred from biomarkers that are subject to measurement error. Diagnosis of infection based on observed symptoms can itself be regarded as an imperfect test of infection status. The temporal relationship between infection and marker outcomes may be complex, especially for recurrent diseases where individuals can experience multiple bouts of infection. We propose an approach that first models the unobserved longitudinal infection status of individuals conditional on relevant covariates, and then jointly models the longitudinal sequence of biomarker outcomes conditional on infection status and covariate information through time, thus resulting in a joint model for longitudinal infection and biomarker sequences. This model can be used to investigate the temporal dynamics of infection, and to evaluate the usefulness of biomarkers for monitoring purposes. Our work is motivated and illustrated by a longitudinal study of bovine digital dermatitis (BDD) on commercial dairy farms in North West England and North Wales, in which the infection of interest is Treponeme spp., and the biomarkers of interest are a continuous enzyme-linked immunosorbent assay test outcome and a dichotomous outcome, foot lesion status. BDD is known to be one of the possible causes of foot lesions in cows.

Successful pharmaceutical drug development requires finding correct doses. The issues that conventional dose‐response analyses consider, namely whether responses are related to doses, which doses have responses differing from a control dose response, the functional form of a dose‐response relationship, and the dose(s) to carry forward, do not need to be addressed simultaneously. Determining if a dose‐response relationship exists, regardless of its functional form, and then identifying a range of doses to study further may be a more efficient strategy. This article describes a novel estimation‐focused Bayesian approach (BMA‐Mod) for carrying out the analyses when the actual dose‐response function is unknown. Realizations from Bayesian analyses of linear, generalized linear, and nonlinear regression models that may include random effects and covariates other than dose are optimally combined to produce distributions of important secondary quantities, including test‐control differences, predictive distributions of possible outcomes from future trials, and ranges of doses corresponding to target outcomes. The objective is similar to the objective of the hypothesis‐testing based MCP‐Mod approach, but provides more model and distributional flexibility and does not require testing hypotheses or adjusting for multiple comparisons. A number of examples illustrate the application of the method.

Early‐onset torsion dystonia is a dominant motor disorder linked to mutations in torsinA. TorsinA is weakly related to a superfamily of chaperone‐like proteins. The function of the torsin group remains largely unknown. Here we use RNAi and over‐expression to analyze the function of torp4a, the only Drosophila torsin. Targeted down‐regulation in the eye causes progressive degeneration of the retina. Conversely, over‐expression of torp4a protects from age‐related degeneration. In the retinas of young animals, a correlation with the lysosome‐related organelle, the pigment granule, is also observed. Lowering torp4a causes an increase in pigment granules, while over‐expression causes loss of granules. We have performed a screen for genetic interactors of torp4a, identifying a number of mutants, including two members of the AP‐3 complex. Other genetic interactors found included genes related to actin and myosin function. Our findings indicate that the Drosophila torsin, torp4a, functions with molecules consistent with its already predicted roles in the endoplasmic reticulum/nuclear envelope compartment, and identify potential new interactions with AP‐3 like components. © 2006 Wiley Periodicals, Inc. J Neurobiol, 2006

The inclusion body process route for manufacturing proteins offers distinct process advantages in terms of expression levels and the ease of initial inclusion body recovery. The efficiency of the refolding unit operation, however, does determine the overall economic feasibility of a process. Dilution refolding is the simplest and most extensively used refolding operation, although significant yield losses often occur due mainly to aggregation. Operating variables may have a significant effect on the degree of aggregation, but a systematic study has not been reported. This study investigates the effect of operating variables on the dilution refolding of solubilized r-trypsinogen inclusion bodies in a pulse-fed stirred reactor. Variables investigated were inclusion body washing, stirring speed, feed rate, concentration of solubilized r-trypsinogen, and concentration of urea during solubilization of the inclusion bodies. Additionally, the effect of baffles in the reactor was investigated. The yield of renatured r-trypsinogen varied between 12 ± 0.2% and 21 ± 1.0% depending on the specific combination of operating variables employed. It is clear that a suboptimal operating strategy can significantly reduce protein yield. In particular, we note that an increased intensity of mixing adversely affected yield in contrast to previous reports indicating that enhanced dispersion increases yield. We conclude that yield is determined not only by the efficiency of dispersion, but also by the local chemical environment of the protein as it folds, and the rate of change of this environment. This will be controlled by micromixing effects, and hence the intensity of agitation, in a complex manner requiring further characterization.

A better understanding of Chagas’ disease is important because knowledge about its progression and about the participation of the different types of cells in this disease is still lacking. To clarify this system, the kinetics of inflammatory cells and parasite nests were shown in an experiment. Using these experimental data, we have developed a three-dimensional multi-agent-based computational model for the evolution of Chagas’ disease. Our model includes five different types of agents: inflammatory cell, fibrosis, cardiomyocyte, fibroblast, and Trypanosoma cruzi. Fibrosis is fixed and the other types of agents can move through the empty space. They move randomly by using the Moore neighborhood. This model reproduces the acute and chronic phases of Chagas’ disease and the volume occupied by all different types of cells in the cardiac tissue.
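The movement rule described in this abstract (fixed fibrosis, other agents stepping at random into empty sites of the three-dimensional Moore neighborhood) can be sketched as follows. This is an illustration only: the sparse-dictionary grid, the closed lattice boundary, the update order, and all function names are assumptions, not details given by the authors.

```python
import itertools
import random

# The 26 offsets of the 3D Moore neighborhood: every site that shares a
# face, an edge, or a corner with the central site.
MOORE_OFFSETS = [d for d in itertools.product((-1, 0, 1), repeat=3)
                 if d != (0, 0, 0)]

FIXED_TYPES = {"fibrosis"}  # per the abstract, fibrosis does not move


def neighbors(pos, size):
    """Moore neighbors of pos that lie inside a size x size x size lattice."""
    for dx, dy, dz in MOORE_OFFSETS:
        candidate = (pos[0] + dx, pos[1] + dy, pos[2] + dz)
        if all(0 <= c < size for c in candidate):
            yield candidate


def step(grid, size, rng):
    """Move each mobile agent once to a random empty neighboring site.

    grid maps occupied (x, y, z) sites to an agent type; empty sites are
    simply absent. An agent with no empty neighbor stays where it is.
    """
    for pos, kind in list(grid.items()):
        if kind in FIXED_TYPES:
            continue
        free = [n for n in neighbors(pos, size) if n not in grid]
        if free:
            del grid[pos]
            grid[rng.choice(free)] = kind


# Hypothetical initial configuration on a 5 x 5 x 5 lattice.
tissue = {
    (2, 2, 2): "fibrosis",
    (0, 0, 0): "inflammatory cell",
    (4, 4, 4): "Trypanosoma cruzi",
    (1, 3, 2): "fibroblast",
}
step(tissue, size=5, rng=random.Random(0))
```

A full model would add the rules by which agents interact, proliferate, and die; those are not specified in the abstract and are left out here.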

Inhibition of protein synthesis by cycloheximide blocks subsequent division of a mammalian cell, but only if the cell is exposed to the drug before the "restriction point" (i.e. within the first several hours after birth). If exposed to cycloheximide after the restriction point, a cell proceeds with DNA synthesis, mitosis and cell division and halts in the next cell cycle. If cycloheximide is later removed from the culture medium, treated cells will return to the division cycle, showing a complex pattern of division times post-treatment, as first measured by Zetterberg and colleagues. We simulate these physiological responses of mammalian cells to transient inhibition of growth, using a set of nonlinear differential equations based on a realistic model of the molecular events underlying progression through the cell cycle. The model relies on our earlier work on the regulation of cyclin-dependent protein kinases during the cell division cycle of yeast. The yeast model is supplemented with equations describing the effects of retinoblastoma protein on cell growth and the synthesis of cyclins A and E, and with a primitive representation of the signaling pathway that controls synthesis of cyclin D.

The existence of a large number of proteins for which both nuclear magnetic resonance (NMR) and X-ray crystallographic coordinates have been deposited into the Protein Data Bank (PDB) makes the statistical comparison of the corresponding crystal and NMR structural models over a large data set possible, and facilitates the study of the effect of the crystal environment and other factors on structure. We present an approach for detecting statistically significant structural differences between crystal and NMR structural models which is based on structural superposition and the analysis of the distributions of atomic positions relative to a mean structure. We apply this to a set of 148 protein structure pairs (crystal vs NMR), and analyze the results in terms of methodological and physical sources of structural difference. For every one of the 148 structure pairs, the backbone root-mean-square distance (RMSD) over core atoms of the crystal structure to the mean NMR structure is larger than the average RMSD of the members of the NMR ensemble to the mean, with 76% of the structure pairs having an RMSD of the crystal structure to the mean more than a factor of two larger than the average RMSD of the NMR ensemble. On average, the backbone RMSD over core atoms of the crystal structure to the mean NMR structure is approximately 1 Å. If non-core atoms are included, this increases to 1.4 Å due to the presence of variability in loops and similar regions of the protein. The observed structural differences are only weakly correlated with the age and quality of the structural model and differences in conditions under which the models were determined. We examine steric clashes when a putative crystalline lattice is constructed using a representative NMR structure, and find that repulsive crystal packing plays a minor role in the observed differences between crystal and NMR structures. The observed structural differences likely have a combination of physical and methodological causes. Stabilizing attractive interactions arising from intermolecular crystal contacts, which shift the equilibrium of the crystal structure relative to the NMR structure, are a likely physical source that can account for some of the observed differences. Methodological sources of apparent structural difference include insufficient sampling or other issues which could give rise to errors in the estimates of the precision and/or accuracy.
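The central comparison in this abstract, the RMSD of the crystal structure to the mean NMR structure against the average RMSD of the ensemble members to that mean, can be sketched as below. The sketch assumes that all models have already been superposed and contain the same atoms in the same order; the function names and the toy coordinates are hypothetical and not taken from the study.

```python
import math


def mean_structure(ensemble):
    """Average the coordinates of each atom over all models of an ensemble."""
    n_models = len(ensemble)
    n_atoms = len(ensemble[0])
    return [
        tuple(sum(model[i][k] for model in ensemble) / n_models for k in range(3))
        for i in range(n_atoms)
    ]


def rmsd(a, b):
    """Root-mean-square distance between two pre-superposed coordinate sets."""
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(a, b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(a))


def crystal_to_ensemble_ratio(crystal, ensemble):
    """RMSD(crystal, mean NMR) divided by the average RMSD(member, mean NMR)."""
    mean = mean_structure(ensemble)
    spread = sum(rmsd(model, mean) for model in ensemble) / len(ensemble)
    return rmsd(crystal, mean) / spread


# Toy two-atom example with hypothetical coordinates (in Å).
nmr_models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)],
]
crystal_model = [(0.0, 4.0, 0.0), (1.0, 4.0, 0.0)]
ratio = crystal_to_ensemble_ratio(crystal_model, nmr_models)
```

A ratio above 2 corresponds to the situation the abstract reports for 76% of the structure pairs; restricting the atom lists to core backbone atoms is left to the caller.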

Mammalian cytochromes P450 (CYP) are enzymes of great biological and pharmaco-toxicological relevance. Due to their membrane-bound nature, the structural characterization of these proteins is extremely difficult, and therefore computational techniques, such as comparative modeling, may help to obtain reliable structures of members of this family. An important feature of CYP is the presence of an iron-containing porphyrin group at the enzyme active site. This calls for quantum chemical calculations to derive charges and parameters suitable for classical force field-based investigations of this protein family. In this report, we first carried out density functional theory (DFT) computations to derive suitable charges for the Fe2+-containing heme group of P450 enzymes. Then, by means of the homology modeling technique, and taking advantage of the recently published crystal structure of the human CYP2C9, we built a new model of the human aromatase (CYP19) enzyme. Furthermore, to study the thermal stability of the new model as well as to test the suitability of the new DFT-based heme parameters, molecular dynamics (MD) simulations were carried out on both CYP2C9 and CYP19. Finally, the last few ns of the aromatase MD trajectories were investigated following the essential dynamics protocol, which allowed the detection of some correlated motions among some protein domains.

Leukocytes, as an indispensable arm of the immune system, need to be recruited from the flowing blood and transferred to the sites of infection. Their extravasation is feasible due to their ability to tether and roll over the activated endothelium, which depends strongly on the association of their selectin molecules with ligands on the activated endothelial cells. In view of the importance of this interaction for the physiological immune functions as well as for autoimmune diseases, specifying the affinity of selectins to their ligands at the single molecule level appears a challenging task to gain insight into the mechanisms that control leukocyte–endothelial avidity. To this end we functionalized substrates with P‐selectin and cantilever probes with its major ligand, the P‐selectin glycoprotein ligand‐1, and used atomic force microscopy to measure their unbinding force. Two different chemical protocols were used for the tethering of the molecules on the substrates, one based on a homobifunctional poly(ethylene glycol) linker and the other on the use of antibody‐specific binding. The unbinding forces measured with the two methods were 312 ± 149 and 230 ± 57 pN, respectively. Measurements on activated endothelial cells, indicative of single molecule interactions, gave comparable results. Copyright © 2011 John Wiley & Sons, Ltd.

