Similar articles
1.
Two interesting results in the literature concerning the Poisson and the negative binomial distributions are due to Moran (1952) and Patil & Seshadri (1964), respectively. Moran's result provided a fundamental property of the Poisson distribution. Roughly speaking, he showed that if Y and Z are independent, non-negative, integer-valued random variables with X = Y + Z then, under some mild restrictions, the conditional distribution of Y | X is binomial if and only if Y and Z are Poisson random variables. Motivated by Moran's result, Patil & Seshadri obtained a general characterization. A special case of this characterization suggests that, under conditions similar to those imposed by Moran, Y | X is negative hypergeometric if and only if Y and Z are negative binomial. In this paper we examine the results of Moran and Patil & Seshadri in the case where the conditional distribution of Y | X is truncated at an arbitrary point k − 1 (k = 1, 2, …). Specifically, we ask whether Moran's property of the Poisson distribution, and subsequently Patil & Seshadri's property of the negative binomial distribution, can be extended, in one form or another, to the case where Y | X is binomial truncated at k − 1 and negative hypergeometric truncated at k − 1, respectively.
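
As a hedged numerical illustration of the "if" direction of Moran's result only (not of the truncated case studied in the paper): when Y and Z are independent Poisson variables, the conditional distribution of Y given that their sum equals n should coincide with a binomial distribution with n trials and success probability rate_y / (rate_y + rate_z). The function names and the rates 1.5 and 2.5 below are arbitrary choices made for this sketch.

```python
import math


def poisson_pmf(k, rate):
    """P(K = k) for a Poisson variable with the given rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def conditional_pmf(y, n, rate_y, rate_z):
    """P(Y = y | Y + Z = n) for independent Poisson Y and Z."""
    joint = poisson_pmf(y, rate_y) * poisson_pmf(n - y, rate_z)
    # The sum of independent Poisson variables is Poisson with the summed rate.
    return joint / poisson_pmf(n, rate_y + rate_z)


def binomial_pmf(y, n, p):
    """P(B = y) for a binomial variable with n trials and success probability p."""
    return math.comb(n, y) * p ** y * (1 - p) ** (n - y)


def max_gap(n, rate_y, rate_z):
    """Largest absolute difference between the two pmfs over y = 0..n."""
    p = rate_y / (rate_y + rate_z)
    return max(
        abs(conditional_pmf(y, n, rate_y, rate_z) - binomial_pmf(y, n, p))
        for y in range(n + 1)
    )


if __name__ == "__main__":
    # The gap should be at the level of floating-point rounding error.
    print(max_gap(n=6, rate_y=1.5, rate_z=2.5))
```

The check covers one direction for one pair of rates; the converse ("only if") is the substantive part of the characterization and is not something a numerical spot check can establish.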

2.
The purpose of this study was to investigate the combined influence of three factors, each at three levels, on the formulation of dacarbazine (a water-soluble drug) loaded cubosomes. A Box–Behnken design was used to obtain a second-order polynomial equation with interaction terms to predict response values. The coded variables X1, X2, and X3, representing the amounts of monoolein, polymer, and drug, respectively, were selected as the independent variables. Fifteen experimental runs were conducted, and the particle size (Y1) and encapsulation efficiency (Y2) were evaluated as dependent variables. We performed multiple regression to establish a full-model second-order polynomial equation relating the independent and dependent variables. A second-order polynomial regression model was constructed for Y1 and confirmed by checkpoint analysis. The optimization process and Pareto charts were obtained automatically, and they predicted the levels of the coded variables X1, X2, and X3 (−1, 0.53485, and −1, respectively) that minimized Y1 while maximizing Y2. These corresponded to a cubosome formulation made from 100 mg of monoolein, 107 mg of polymer, and 2 mg of drug, with an average diameter of 104.7 nm and an encapsulation efficiency of 6.9%. The Box–Behnken design proved to be a useful tool for optimizing the particle size of these drug-loaded cubosomes. For encapsulation efficiency (Y2), further studies are needed to identify an appropriate regression model. Key words: Box–Behnken design, cubosomes, dacarbazine, formulation variables
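
The fifteen runs mentioned above match the textbook three-factor Box–Behnken layout of twelve edge-midpoint runs plus three centre points. The sketch below generates that layout in coded units. The split into 12 + 3 is an assumption (the abstract gives only the total), and the function name is hypothetical.

```python
from itertools import combinations, product


def box_behnken(n_factors=3, n_center=3):
    """Return Box-Behnken runs in coded units (-1, 0, +1).

    Each pair of factors is varied over its four (+/-1, +/-1) corners while
    the remaining factors are held at 0; centre points are appended last.
    n_center=3 is an assumed value, not one taken from the study.
    """
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * n_factors
            run[i], run[j] = a, b
            runs.append(tuple(run))
    runs.extend([(0,) * n_factors] * n_center)
    return runs


if __name__ == "__main__":
    design = box_behnken()
    print(len(design), "runs")
    for run in design:
        print(run)
```

Converting a coded level such as 0.53485 back to milligrams would require the low and high settings of each factor, which the abstract does not report, so that step is left out.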

3.
The construction of a dendrogram on a set of individuals is a key component of a genome-wide association study. However, even with modern sequencing technologies, the distances between individuals required for the construction of such a structure may not always be reliable, making it tempting to exclude them from an analysis. This, in turn, results in an input set for dendrogram construction that consists of only partial distance information, which raises the following fundamental question: for which (proper) subsets of a dendrogram's leaf set can we uniquely reconstruct the dendrogram from the distances that it induces on the elements of such a subset? By formalizing a dendrogram as an edge-weighted, rooted, phylogenetic tree on a pre-given finite set X with |X| ≥ 3 whose edge-weighting is equidistant, and the subsets Y of X for which the distance between every pair of elements in Y is known as sets of 2-subsets of X, we investigate this problem from the perspective of when such a tree is lassoed, that is, uniquely determined by the elements in such a set. For this, we consider four different formalizations of the idea of "uniquely determining", giving rise to four distinct types of lassos. We present characterizations for all of them in terms of the child-edge graphs of the interior vertices of such a tree. Our characterizations imply in particular that, if the tree in question is binary, then all four types of lasso must coincide.

4.
5.
One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs) using high-throughput genomic data, in particular microarray gene expression data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge aims to evaluate the success of GRN inference algorithms on benchmarks of simulated data. In this article, we present GENIE3, a new algorithm for the inference of GRNs that was the best performer in the DREAM4 In Silico Multifactorial challenge. GENIE3 decomposes the prediction of a regulatory network between p genes into p different regression problems. In each of the regression problems, the expression pattern of one of the genes (the target gene) is predicted from the expression patterns of all the other genes (the input genes), using the tree-based ensemble methods Random Forests or Extra-Trees. The importance of an input gene in the prediction of the target gene expression pattern is taken as an indication of a putative regulatory link. Putative regulatory links are then aggregated over all genes to provide a ranking of interactions from which the whole network is reconstructed. In addition to performing well on the DREAM4 In Silico Multifactorial challenge simulated data, we show that GENIE3 compares favorably with existing algorithms in deciphering the genetic regulatory network of Escherichia coli. It makes no assumption about the nature of gene regulation, can deal with combinatorial and non-linear interactions, produces directed GRNs, and is fast and scalable. In conclusion, we propose a new algorithm for GRN inference that performs well on both synthetic and real gene expression data. The algorithm, based on feature selection with tree-based ensemble methods, is simple and generic, making it adaptable to other types of genomic data and interactions.
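
The decomposition described above (one scoring problem per target gene, then a single global ranking of candidate links) can be sketched as follows. This is not GENIE3: to keep the example dependency-free, the importance score is a stand-in (absolute Pearson correlation) rather than a Random Forest or Extra-Trees importance. All function names and the toy data are hypothetical.

```python
import random


def pearson(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def rank_links(expr, importance=lambda x, y: abs(pearson(x, y))):
    """Rank candidate links (regulator -> target) over all genes.

    expr maps a gene name to its expression values, one per sample.
    Each gene is treated in turn as the target and every other gene is
    scored as a potential input. The default score is a STAND-IN, not the
    tree-based importance used by GENIE3.
    """
    links = []
    for target, y in expr.items():
        for regulator, x in expr.items():
            if regulator != target:
                links.append((importance(x, y), regulator, target))
    return sorted(links, reverse=True)


def demo_expression(seed=0, n=200):
    """Toy data: g2 follows g1 with noise, g3 is unrelated."""
    rng = random.Random(seed)
    g1 = [rng.gauss(0, 1) for _ in range(n)]
    g2 = [a + rng.gauss(0, 0.3) for a in g1]
    g3 = [rng.gauss(0, 1) for _ in range(n)]
    return {"g1": g1, "g2": g2, "g3": g3}


if __name__ == "__main__":
    for score, regulator, target in rank_links(demo_expression()):
        print(f"{regulator} -> {target}: {score:.3f}")
```

Because the stand-in score is symmetric, it assigns the same value to both directions of a link. An asymmetric, per-target importance such as the one the abstract describes is what would make the ranked links directed.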

6.

Motivation

Conventional identification methods for gene regulatory networks (GRNs) have overwhelmingly adopted static topology models, which remain unchanged over time, to represent the underlying molecular interactions of a biological system. However, GRNs are dynamic in response to physiological and environmental changes. Although there is a rich literature on modeling static or temporally invariant networks, how to systematically recover these temporally changing networks remains a major challenge. The purpose of this study is to suggest a two-step strategy that recovers time-varying GRNs.

Results

This paper proposes a switching auto-regressive model to describe the dynamics of time-varying GRNs, together with a two-step strategy to recover their structure. In the first step, the change points are detected by a Kalman-filter based method. The observed time series are divided into several segments using these detection results, and each time series segment lying between two successive change points is associated with an individual static regulatory network. In the second step, conditional network structure identification methods are used to reconstruct the topology for each time interval. This two-step strategy efficiently decouples the change point detection problem from the topology inference problem. Simulation results show that the proposed strategy can detect the change points precisely and recover each individual topology structure effectively. Moreover, computation results with developmental data from Drosophila melanogaster show that the proposed change point detection procedure also works effectively in real-world applications and that its change point estimation accuracy exceeds that of other existing approaches, which means the suggested strategy may also be helpful in solving actual GRN reconstruction problems.

7.
Abstract. Variation partitioning by (partial) constrained ordination is a popular method for exploratory data analysis, but applications are mostly restricted to simple ecological questions involving only two or three sets of explanatory variables, such as climate and soil, because the complexity of calculations and results increases rapidly with the number of explanatory variable sets. The existence of a unique algorithm for partitioning the variation in a set of response variables on n sets of explanatory variables is demonstrated; it is shown how the 2^n − 1 non‐overlapping components of variation can be calculated. Methods for evaluation and presentation of variation partitioning results are reviewed, and a recursive algorithm is proposed for distributing the many small components of variation over simpler components. Several issues related to the use and usefulness of variation partitioning with n sets of explanatory variables are discussed with reference to a worked example.
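
One way to obtain the 2^n − 1 non-overlapping components is inclusion-exclusion over the variation explained by every union of explanatory sets. The sketch below follows that route and is not necessarily the algorithm of the paper. It assumes the explained fraction for each non-empty combination of sets is already available (for example from separate constrained ordinations); the function names and the numbers in the example are invented for illustration.

```python
from itertools import combinations


def nonempty_subsets(items):
    """All non-empty subsets of items, as frozensets."""
    items = sorted(items)
    return [
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


def partition_variation(explained):
    """Split explained variation into 2**n - 1 non-overlapping components.

    explained maps each non-empty frozenset T of set names to the fraction
    of variation explained by the union of the sets in T. The result maps
    each non-empty subset S to the component shared by exactly the sets in S.
    """
    names = frozenset().union(*explained)
    total = explained[names]

    def inside(subset):
        # Variation that is lost when only the sets outside `subset` are used.
        rest = names - subset
        return total - (explained[rest] if rest else 0.0)

    components = {}
    for s in nonempty_subsets(names):
        components[s] = sum(
            (-1) ** (len(s) - len(t)) * inside(t) for t in nonempty_subsets(s)
        )
    return components


if __name__ == "__main__":
    fs = frozenset
    # Invented fractions for two sets, "climate" and "soil".
    explained = {fs({"climate"}): 0.3, fs({"soil"}): 0.4,
                 fs({"climate", "soil"}): 0.6}
    for subset, value in partition_variation(explained).items():
        print(sorted(subset), round(value, 3))
```

For two sets this reduces to the familiar unique and shared fractions. With real data some components can come out negative, which is one of the interpretive issues that arises with more than a few sets.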

8.
The nucleolus organizers on the X and Y chromosomes of Drosophila melanogaster are the sites of 200-250 tandemly repeated genes for ribosomal RNA. As there is no meiotic crossing over in male Drosophila, the X and Y chromosomal rDNA arrays should be evolutionarily independent, and therefore divergent. The rRNAs produced by X and Y are, however, very similar, if not identical. Molecular, genetic and cytological analyses of a series of X chromosome rDNA deletions (bb alleles) showed that they arose by unequal exchange through the nucleolus organizers of the X and Y chromosomes. Three separate exchange events generated compound X·YL chromosomes carrying mainly Y-specific rDNA. This led to the hypothesis that X-Y exchange is responsible for the coevolution of X and Y chromosomal rDNA. We have tested and confirmed several of the predictions of this hypothesis. First, X·YL chromosomes must be found in wild populations. We have found such a chromosome. Second, the X·YL chromosome must lose the YL arm, and/or be at a selective disadvantage to normal X+ chromosomes, to retain the normal morphology of the X chromosome. Six of seventeen sublines founded from homozygous X·YL bb stocks have become fixed for chromosomes with spontaneous loss of part or all of the appended YL. Third, rDNA variants on the X chromosome are expected to be clustered within the X+ nucleolus organizer, recently donated ("Y") forms being proximal, and X-specific forms distal. We present evidence for clustering of rRNA genes containing Type 1 insertions. Consequently, X-Y exchange is probably responsible for the coevolution of X and Y rDNA arrays.

9.
Haplotyping in pedigrees provides valuable information for genetic studies (e.g., linkage analysis and association studies). To identify a set of haplotype configurations with the highest likelihoods for a large pedigree with a large number of linked loci, we previously proposed a conditional enumeration haplotyping method. That method sets a threshold on the conditional probabilities of the possible ordered genotypes at every unordered individual-marker, deletes ordered genotypes with low conditional probabilities, and thereby eliminates haplotype configurations with low likelihoods. In this article we present a rapid haplotyping algorithm that modifies our previous method by setting an additional threshold on the ratio of the conditional probability of a haplotype configuration to the largest conditional probability of all haplotype configurations, in order to eliminate configurations with relatively low conditional probabilities. The new algorithm is much more efficient than our previous method and the widely used software SimWalk2.

10.
Maroni G, Plaut W. Genetics, 1973, 74(2): 331-342
The level of activity of the enzyme glucose-6-phosphate dehydrogenase was determined in flies having seven different chromosomal constitutions. All those having an integral number of chromosomes [XAA, XXAA, XAAA, XXAAA, and XXXAAA (X = X chromosome, A = set of autosomes)] were found to have similar units of enzyme activity/mg live weight, while diploid females with a duplication and triploid females with a deficiency showed a dosage effect. The amount of enzyme activity per cell, on the other hand, is also independent of the number of X's present but appears roughly proportional to the number of sets of autosomes. It is proposed that dosage-compensated sex-linked genes are controlled by a positively acting regulatory factor(s) of autosomal origin. With this hypothesis it is possible to explain dosage compensation as a consequence of general regulatory mechanisms without invoking a special device which applies only to the X chromosomes.

11.
The fundamental properties of a punctured normal distribution are studied. The results are applied to three issues concerning X/Y, where X and Y are independent normal random variables with means μX and μY, respectively. First, estimation of μX/μY as a surrogate for E(X/Y) is justified; then the reason for preferring a weighted average over an arithmetic average as an estimator of μX/μY is given. Finally, an approximate confidence interval for μX/μY is provided. A grain yield data set is used to illustrate the results. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

12.
Inference of gene regulatory networks (GRNs) is one of the most challenging research problems of systems biology. In this investigation, a new GRN inference methodology called Entropic Biological Score (EBS) is proposed, which linearly combines the mean conditional entropy (MCE) from expression levels and a Biological Score (BS) obtained by integrating different biological data sources. The EBS is validated with the cell-cycle-related functional annotation information available from the Munich Information Center for Protein Sequences (MIPS), and compared with existing GRN inference methods such as MRNET, ARACNE, CLR and MCE. For real networks, the performance of EBS, which integrates different data sources, is found to be superior to these inference methods. The best results for EBS are obtained with the weights w1 = 0.2 and w2 = 0.8 for the MCE and BS values, respectively, where approximately 40% of the inferred connections are found to be correct, significantly better than related methods. The results also indicate that the expression profile is able to recover some true connections that are not present in biological annotations, thus leading to the possibility of discovering new relations between genes.

13.
Research in evolutionary psychology, and life history theory in particular, has yielded important insights into the developmental processes that underpin variation in growth, psychological functioning, and behavioral outcomes across individuals. Yet, there are methodological concerns that limit the ability to draw causal inferences about human development and psychological functioning within a life history framework. The current study used a simulation-based modeling approach to estimate the degree of genetic confounding in tests of a well-researched life history hypothesis: that father absence (X) is associated with earlier age at menarche (Y). The results demonstrate that the genetic correlation between X and Y can confound the phenotypic association between the two variables, even if the genetic correlation is small, suggesting that failure to control for the genetic correlation between X and Y could produce a spurious phenotypic correlation. We discuss the implications of these results for research on human life history, and highlight the utility of incorporating genetically sensitive tests into future life history research.

14.
Mutual information (MI), a quantity describing the nonlinear dependence between two random variables, has been widely used to construct gene regulatory networks (GRNs). Despite its good performance, MI cannot separate direct regulations from indirect ones among genes. Although the conditional mutual information (CMI) is able to identify direct regulations, it generally underestimates the regulation strength, i.e. it may result in false negatives when inferring gene regulations. In this work, to overcome these problems, we propose a novel concept, namely conditional mutual inclusive information (CMI2), to describe the regulations between genes. Furthermore, with CMI2, we develop a new approach, namely CMI2NI (CMI2-based network inference), for reverse-engineering GRNs. In CMI2NI, CMI2 is used to quantify the mutual information between two genes given a third one by calculating the Kullback–Leibler divergence between the postulated distributions that include and exclude the edge between the two genes. The benchmark results on the GRNs from the DREAM challenge as well as the SOS DNA repair network in Escherichia coli demonstrate the superior performance of CMI2NI. Specifically, even for gene expression data with small sample size, CMI2NI can not only infer the correct topology of the regulation networks but also accurately quantify the regulation strength between genes. As a case study, CMI2NI was also used to reconstruct cancer-specific GRNs using gene expression data from The Cancer Genome Atlas (TCGA). CMI2NI is freely accessible at http://www.comp-sysbio.org/cmi2ni.
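
The point that MI cannot separate direct from indirect regulation while CMI can be illustrated on a simulated chain X → Y → Z. The sketch below covers only plain MI and CMI, not CMI2, and it assumes jointly Gaussian variables so that both quantities have closed forms in terms of (partial) correlations. The function names, noise levels and sample size are arbitrary choices for this illustration.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def gaussian_mi(x, z):
    """MI(X; Z) in nats under a Gaussian assumption: -0.5 * ln(1 - r^2)."""
    r = pearson(x, z)
    return -0.5 * math.log(1.0 - r * r)


def gaussian_cmi(x, z, y):
    """CMI(X; Z | Y) in nats under a Gaussian assumption, via partial correlation."""
    r_xz, r_xy, r_zy = pearson(x, z), pearson(x, y), pearson(z, y)
    partial = (r_xz - r_xy * r_zy) / math.sqrt((1 - r_xy ** 2) * (1 - r_zy ** 2))
    return -0.5 * math.log(1.0 - partial * partial)


def simulate_chain(n=2000, seed=0):
    """Simulate X -> Y -> Z, so X influences Z only through Y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [a + rng.gauss(0, 0.5) for a in x]
    z = [b + rng.gauss(0, 0.5) for b in y]
    return x, y, z


if __name__ == "__main__":
    x, y, z = simulate_chain()
    print("MI(X;Z)    =", gaussian_mi(x, z))      # clearly positive
    print("CMI(X;Z|Y) =", gaussian_cmi(x, z, y))  # close to zero
```

In this chain the X-Z link is indirect, so MI alone would suggest an edge that conditioning on Y removes. The underestimation problem that motivates CMI2 arises in other network configurations and is not reproduced by this example.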

15.
On optimal nonlinear associative recall   Total citations: 6, self-citations: 0, citations by others: 6
The problem of determining the nonlinear function ("blackbox") which optimally associates (on given criteria) two sets of data is considered. The data are given as discrete, finite column vectors, forming two matrices X ("input") and Y ("output") with the same number of columns and arbitrary numbers of rows. An iteration method based on the concept of the generalized inverse of a matrix provides the polynomial mapping of degree k on X by which Y is retrieved in an optimal way in the least squares sense. The results can be applied to a wide class of problems since such polynomial mappings may approximate any continuous real function from the "input" space to the "output" space to any required degree of accuracy. Conditions under which the optimal estimate is linear are given. Linear transformations on the input key-vectors and analogies with the "whitening" approach are also discussed. Conditions of "stationarity" on the processes of which X and Y are assumed to represent a set of sample sequences can be easily introduced. The optimal linear estimate is given by a discrete counterpart of the Wiener-Hopf equation and, if the key-signals are noise-like, the holographic-like scheme of associative memory is obtained as the optimal nonlinear estimator. The theory can be applied to the system identification problem. It is finally suggested that the results outlined here may be relevant to the construction of models of associative, distributed memory.

16.
A specific regular inbreeding system of quadruple half-second cousin mating is considered. A regular inbreeding system can be thought of as a graph which satisfies certain natural homogeneity properties. Random walks X_n and Y_n are introduced on the nodes of the graph; the event {X_n = Y_n} is a renewal event by the homogeneity properties. In Arzberger (1985) it is shown that 1) graphs associated with left cancellative semigroups are regular, and 2) for regular systems, the population becomes genetically uniform if and only if the event {X_n = Y_n} is recurrent. In Arzberger (1986) the system of quadruple half-second cousin mating is associated with a cancellative semigroup; thus the system is regular. In this paper we show that 1) A_n is asymptotically of the form c n^3, where A_n is the number of ancestors n generations into the past, 2) {X_n = Y_n} is not recurrent (this is shown by associating (X_n, Y_n) with a random walk in Z^3), and 3) P[X_{3n} = Y_{3n}] is asymptotically of the form c n^(−3/2). Thus, in this example, genetic heterogeneity is maintained with a cubic rate of growth for A_n, not with an exponential growth rate as in all previous examples of regular inbreeding systems in which genetic heterogeneity is maintained.

17.
In this article, we introduce an exploratory framework for learning patterns of conditional co-expression in gene expression data. The main idea behind the proposed approach consists of estimating how the information content shared by a set of M nodes in a network (where each node is associated with an expression profile) varies upon conditioning on a set of L conditioning variables (in the simplest case represented by a separate set of expression profiles). The method is non-parametric and is based on the concept of statistical co-information, which, unlike conventional correlation-based techniques, is not restricted in scope to linear conditional dependency patterns. Moreover, such conditional co-expression relationships can potentially indicate regulatory interactions that do not manifest themselves when only pair-wise relationships are considered. A moment-based approximation of the co-information measure is derived that efficiently gets around the problem of estimating high-dimensional multivariate probability density functions from the data, a task usually not viable due to the intrinsic sample size limitations that characterize expression level measurements. By applying the proposed exploratory method, we analyzed a whole-genome microarray assay of the eukaryote Saccharomyces cerevisiae and were able to learn statistically significant patterns of conditional co-expression. A selection of such interactions that carry a meaningful biological interpretation is discussed.

18.
The inference of gene regulatory networks (GRNs) from gene expression data is an unsolved problem of great importance. This inference has been stated, though not proven, to be underdetermined, implying that there could be many equivalent (indistinguishable) solutions. Motivated by this fundamental limitation, we have developed a new framework and algorithm, called TRaCE, for the ensemble inference of GRNs. The ensemble corresponds to the inherent uncertainty associated with discriminating direct and indirect gene regulations from steady-state data of gene knock-out (KO) experiments. We applied TRaCE to analyze the inferability of random GRNs and the GRNs of E. coli and yeast from single- and double-gene KO experiments. The results showed that, with the exception of networks with very few edges, GRNs are typically not inferable even when the data are ideal (unbiased and noise-free). Finally, we compared the performance of TRaCE with the top-performing methods of the DREAM4 in silico network inference challenge.

19.
Kunz W. Genetics, 1976, 82(1): 25-34
The number of rRNA cistrons is measured by filter saturation hybridization in different stocks of D. hydei, where the wild-type X chromosome has one nucleolus organizer (NO) and the wild-type Y has two separated NO's. (see PDF) females having no X chromosomal NO show an rDNA content exceeding that of a Y chromosome. An even greater increase in the rRNA cistron number is measured in two translocation stocks where the (see PDF) is combined with one half of a Y and, therefore, each stock contains only one of the two Y chromosomal NO's. But when the same Y fragments are brought together with a wild-type X chromosome they lose about one-half of their rRNA cistrons within one generation. Males with two complementary Y fragments but having no X chromosomal NO show a considerably higher rDNA content than the (see PDF) females, although both are equal in respect of their NO number. Consideration is given to related phenomena in Drosophila melanogaster.

20.
Semiparametric regression estimation in the presence of dependent censoring   Total citations: 5, self-citations: 0, citations by others: 5
We propose a semiparametric estimation procedure for estimating the regression of an outcome Y, measured at the end of a fixed follow-up period, on baseline explanatory variables X, measured prior to start of follow-up, in the presence of dependent censoring given X. The proposed estimators are consistent when the data are ‘missing at random’ but not ‘missing completely at random’ (Rubin, 1976), and do not require full specification of the complete data likelihood. Specifically, we assume that the probability of censoring at time t is independent of the outcome Y conditional on the recorded history up to t of a vector of time-dependent covariates that are correlated with Y. Our estimators can be used to adjust for dependent censoring and nonrandom noncompliance in randomised trials studying the effect of a treatment on the mean of a response variable of interest. Even with independent censoring, our methods allow the investigator to increase efficiency by exploiting the correlation of the outcome with a vector of time-dependent covariates.
