Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Genetic programming of the developmental processes in multicellular organisms is proposed to be so intricate and vitally important that a large set of genes is dedicated solely to this end. It is further proposed that this set can be compartmentalized into subsets on the basis of the changes in gene activities that occur during ontogenesis, and that the genes in each subset transiently control the epigenetic activities of a small group of cells. Automatic subset activation is achieved by the product of a gene in each subset that transfers activity specifically to the subset next in the developmental sequence. This device can generate a unidirectional series of activations that cascade hierarchically through development like toppling dominoes. The model provides a basis for developmental phenomena, such as pattern formation, morphogenesis, and regeneration, and it makes testable predictions at the molecular level.
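
The domino-like hand-over described in this abstract can be pictured with a very small sketch. It assumes each subset names exactly one successor; the subset labels and the `transfer` mapping are hypothetical and are not taken from the paper.

```python
def activation_order(transfer, first):
    """Follow hand-over products from subset to subset until the chain ends."""
    order = []
    current = first
    while current is not None:
        if current in order:
            raise ValueError("cascade loops back on itself")
        order.append(current)
        # The active subset's transfer product switches on its successor.
        current = transfer.get(current)
    return order

# Hypothetical subsets; None marks the end of the developmental sequence.
transfer = {"early": "middle", "middle": "late", "late": None}
order = activation_order(transfer, "early")
```

Because every subset points only forward, the sequence is unidirectional by construction, which is the property the abstract emphasizes.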

2.
3.
Identification of differentially expressed (DE) genes across two conditions is a common task with microarray data. Most existing approaches accomplish this goal by examining each gene separately based on a model and then controlling the false discovery rate over all genes. We took a different approach that employs a uniform platform to simultaneously depict the dynamics of the gene trajectories for all genes and select differentially expressed genes. A new Functional Principal Component (FPC) approach is developed for time-course microarray data to borrow strength across genes. The approach is flexible, as the temporal trajectory of the gene expressions is modeled nonparametrically through a set of orthogonal basis functions, and often fewer basis functions are needed to capture the shape of the gene expression trajectory than with existing nonparametric methods. These basis functions are estimated from the data and reflect the major modes of variation in the data. The correlation structure of the gene expressions over time is also incorporated without any parametric assumptions and estimated from all genes, so that information from other genes can be shared when inferring any individual gene. Estimation of the parameters is carried out by an efficient hybrid EM algorithm. In simulations across different scenarios, the proposed method compared favorably with two-way mixed-effects ANOVA and the EDGE method using B-spline basis functions. Application to real data on C. elegans developmental stages also suggested that FPC analysis combined with the hybrid EM algorithm provides a computationally fast and efficient method for identifying DE genes based on time-course microarray data.

4.
5.
Rooman M, Albert J, Dehouck Y, Haye A. PLoS One 2011, 6(12): e27948
Available DNA microarray time series that record gene expression along the developmental stages of multicellular eukaryotes, or in unicellular organisms subject to external perturbations such as stress and diauxie, are analyzed. By pairwise comparison of the gene expression profiles on the basis of a translation-invariant and scale-invariant distance measure corresponding to least-rectangle regression, it is shown that peaks in the average distance values are noticeable and are localized around specific time points. These points systematically coincide with the transition points between developmental phases or just follow the external perturbations. This approach can thus be used to identify automatically, from microarray time series alone, the presence of external perturbations or the succession of developmental stages in arbitrary cell systems. Moreover, our results show that there is a striking similarity between the gene expression responses to these a priori very different phenomena. In contrast, the cell cycle does not involve a perturbation-like phase, but rather continuous gene expression remodeling. Similar analyses were conducted using three other standard distance measures, showing that the one we introduced was superior. Based on these findings, we set up an adapted clustering method that uses this distance measure and classifies the genes on the basis of their expression profiles within each developmental stage or between perturbation phases.
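
The abstract does not give the formula for its least-rectangle distance, so the sketch below substitutes a simpler, well-known measure that has the same two invariances: each profile is shifted to zero mean and rescaled to unit variance before the mean squared gap is taken. This quantity equals 2(1 − r), where r is the Pearson correlation. The function names and toy profiles are assumptions for illustration only.

```python
import math

def standardize(profile):
    """Shift to zero mean and rescale to unit (population) variance."""
    n = len(profile)
    mean = sum(profile) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in profile) / n)
    if sd == 0:
        raise ValueError("a constant profile has no defined shape")
    return [(x - mean) / sd for x in profile]

def invariant_distance(a, b):
    """Mean squared gap between standardized profiles; equals 2 * (1 - r)."""
    za, zb = standardize(a), standardize(b)
    return sum((x - y) ** 2 for x, y in zip(za, zb)) / len(za)

base = [1.0, 2.0, 4.0, 3.0]
shifted_scaled = [10.0 + 5.0 * x for x in base]   # same shape, new offset and scale
d_same = invariant_distance(base, shifted_scaled)
d_opposite = invariant_distance(base, [-x for x in base])
```

Profiles that differ only by an offset and a positive scale factor sit at distance zero, while mirror-image profiles sit at the maximum of 4.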

6.
MOTIVATION: Gene set analysis allows formal testing of subtle but coordinated changes in a group of genes, such as those defined by Gene Ontology (GO) or KEGG Pathway databases. We propose a new method for gene set analysis that is based on principal component analysis (PCA) of gene expression values in the gene set. PCA is an effective method for reducing high dimensionality and capturing variations in gene expression values. However, one limitation of PCA is that the latent variable identified by the first PC may be unrelated to outcome. RESULTS: In the proposed supervised PCA (SPCA) model for gene set analysis, the PCs are estimated from a selected subset of genes that are associated with outcome. As outcome information is used in the gene selection step, this method is supervised, and is thus called the Supervised PCA model. Because of the gene selection step, the test statistic in the SPCA model can no longer be approximated well using a t-distribution. We propose a two-component mixture distribution based on Gumbel extreme value distributions to account for the gene selection step. We show that the proposed method compares favorably to currently available gene set analysis methods using simulated and real microarray data. SOFTWARE: The R code for the analysis used in this article is available upon request; we are currently working on implementing the proposed method in an R package.
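
The two steps named in the abstract, outcome-guided gene selection followed by PCA on the selected genes, can be sketched as below. This is an illustration under several assumptions: genes are screened by absolute Pearson correlation with a fixed, arbitrary threshold, the first PC is found by plain power iteration, and the Gumbel mixture null distribution is not implemented. The authors' R code is not reproduced here, and all names and data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def first_pc_scores(columns, n_iter=200):
    """Scores on the first principal component, via power iteration."""
    n, p = len(columns[0]), len(columns)
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    w = [1.0] * p  # starting direction; assumed not orthogonal to the first PC
    for _ in range(n_iter):
        scores = [sum(w[j] * centered[j][i] for j in range(p)) for i in range(n)]
        w = [sum(c[i] * scores[i] for i in range(n)) for c in centered]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return [sum(w[j] * centered[j][i] for j in range(p)) for i in range(n)]

def supervised_pca(genes, outcome, threshold=0.5):
    """Keep outcome-associated genes, then summarize them by their first PC."""
    selected = [g for g, col in genes.items()
                if abs(pearson(col, outcome)) >= threshold]
    if not selected:
        raise ValueError("no gene passed the selection threshold")
    return selected, first_pc_scores([genes[g] for g in selected])

# Toy gene set: g1 and g2 track the outcome, g3 does not.
outcome = [0, 0, 0, 1, 1, 1]
genes = {
    "g1": [1.0, 1.2, 0.9, 3.1, 2.9, 3.0],
    "g2": [2.0, 2.1, 1.9, 4.0, 4.2, 3.9],
    "g3": [5.0, 1.0, 5.0, 1.0, 5.0, 1.0],
}
selected, scores = supervised_pca(genes, outcome)
```

Because the same outcome is used to pick the genes and then to test the component, a naive t-test on `scores` would be too optimistic, which is the problem the abstract's mixture distribution is meant to address.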

7.
Studying the association between a gene set (e.g., pathway) and exposures using multivariate regression methods is of increasing importance in genomic studies. Such an analysis is often more powerful and interpretable than individual-gene analysis. Since many genes in a gene set are likely not affected by exposures, one is often interested in identifying a subset of genes in the gene set that are affected by exposures. This allows for better understanding of the underlying biological mechanism and for pursuing further biological investigation of these genes. The selected subset of "signal" genes also provides an attractive vehicle for a more powerful test for the association between the gene set and exposures. We propose two computationally simple Canonical Correlation Analysis (CCA) based variable selection methods: Sparse Outcome Selection (SOS) CCA and step CCA, to jointly select a subset of genes in a gene set that are associated with exposures. Several model selection criteria, such as BIC and the new Correlation Information Criterion (CIC), are proposed and compared. We also develop a global test procedure for testing the exposure effects on the whole gene set, accounting for gene selection. Through simulation studies, we show that the proposed methods improve upon an existing method when the genes are correlated and are more computationally efficient. We apply the proposed methods to the analysis of the Normative Aging DNA methylation Study to examine the effects of airborne particulate matter exposures on DNA methylations in a genetic pathway.

8.
A model is proposed for space-dependent cell determination under the influence of a morphogen gradient. It provides an explanation of how groups of cells can be programmed in a particular direction and how a jump from one determination stage to the next can occur between them even though the controlling signal is a smoothly graded morphogen concentration. Together with an earlier proposed mechanism for pattern formation, these models offer a complete system for the generation and interpretation of positional information. Each member of a set of structure-controlling genes is assumed to feed back onto its own activation such that a gene, once activated, remains in the activated state. A repressor, however, is produced by any activated gene of this set. This assures that only one gene of this set is active in one cell at any one time. A selective activation of a particular gene is possible if (i) the morphogen competes with the gene-produced, non-diffusible repressor, (ii) the feedback loops have some overlap and (iii) a hierarchy exists among the structure-controlling genes. The kinetics of this determination have all the properties demanded earlier by a study of early insect development: it proceeds stepwise from determination for more anterior to more posterior structures until the gene that is activated corresponds to the local gradient level. A more anterior structure will be formed if the gradient is destroyed before the final determination level is reached. A more posterior structure will be formed after an additional increase of the morphogen concentration. After completion of the determination, the repressor concentration in each cell depends on which gene has become activated and it can be made roughly proportional to the morphogen concentration which the cell has seen. Therefore, a stable parameter (positional value) becomes available which can be used for further developmental decisions.

9.

Background  

Single Nucleotide Polymorphism (SNP) analysis only captures a small proportion of associated genetic variants in Genome-Wide Association Studies (GWAS), partly due to small marginal effects. Pathway level analysis incorporating prior biological information offers another way to analyze GWAS of complex diseases, and promises to reveal the mechanisms leading to complex diseases. Biologically defined pathways typically comprise numerous genes. If only a subset of genes in the pathways is associated with disease, then a joint analysis including all individual genes would result in a loss of power. To address this issue, we propose a pathway-based method that allows us to test for joint effects by using a pre-selected gene subset. In the proposed approach, each gene is considered as the basic unit, which reduces the number of genetic variants considered and hence reduces the degrees of freedom in the joint analysis. The proposed approach can also be used to investigate the joint effect of several genes in a candidate gene study.

10.
The recent availability of the chicken genome sequence poses the question of whether there are human protein-coding genes conserved in chicken that are currently not included in the human gene catalog. Here, we show, using comparative gene finding followed by experimental verification of exon pairs by RT–PCR, that the addition to the multi-exonic subset of this catalog could be as little as 0.2%, suggesting that we may be closing in on the human gene set. Our protocol, however, has two shortcomings: (i) the bioinformatic screening of the predicted genes, applied to filter out false positives, cannot handle intronless genes; and (ii) the experimental verification could fail to identify expression at a specific developmental time. This highlights the importance of developing methods that could provide a reliable estimate of the number of these two types of genes.

11.
Recent developments in microarray technology make it possible to capture the gene expression profiles for thousands of genes at once. With these data, researchers are tackling problems ranging from the identification of 'cancer genes' to the formidable task of adding functional annotations to our rapidly growing gene databases. Specific research questions suggest patterns of gene expression that are interesting and informative: for instance, genes with large variance or groups of genes that are highly correlated. Cluster analysis and related techniques are proving to be very useful. However, such exploratory methods alone do not provide the opportunity to engage in statistical inference. Given the high dimensionality (thousands of genes) and small sample sizes (often <30) encountered in these datasets, an honest assessment of sampling variability is crucial and can prevent the over-interpretation of spurious results. We describe a statistical framework that encompasses many of the analytical goals in gene expression analysis; our framework is completely compatible with many of the current approaches and, in fact, can increase their utility. We propose the use of a deterministic rule, applied to the parameters of the gene expression distribution, to select a target subset of genes that are of biological interest. In addition to subset membership, the target subset can include information about relationships between genes, such as clustering. This target subset presents an interesting parameter that we can estimate by applying the rule to the sample statistics of microarray data. The parametric bootstrap, based on a multivariate normal model, is used to estimate the distribution of these estimated subsets, and relevant summary measures of this sampling distribution are proposed. We focus on rules that operate on the mean and covariance. Using Bernstein's Inequality, we obtain consistency of the subset estimates, under the assumption that the sample size converges faster to infinity than the logarithm of the number of genes. We also provide a conservative sample size formula guaranteeing that the sample mean and sample covariance matrix are uniformly within a distance epsilon > 0 of the population mean and covariance. The practical performance of the method using a cluster-based subset rule is illustrated with a simulation study. The method is illustrated with an analysis of a publicly available leukemia data set.
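
The idea of applying a deterministic rule to bootstrap replicates can be sketched as follows. Two simplifications are assumed and should be read as such: genes are resampled independently (a diagonal covariance, whereas the abstract uses a full multivariate normal model), and the rule is a plain mean cutoff rather than the cluster-based rule of the paper. Gene names, data and the cutoff are made up.

```python
import random
import statistics

def select_subset(means, cutoff):
    """Deterministic rule: keep genes whose mean expression exceeds the cutoff."""
    return {g for g, m in means.items() if m > cutoff}

def bootstrap_subset_frequencies(data, cutoff, n_boot=500, seed=0):
    """Fraction of parametric-bootstrap replicates in which each gene is selected."""
    rng = random.Random(seed)
    n = len(next(iter(data.values())))
    est_mean = {g: statistics.fmean(v) for g, v in data.items()}
    est_sd = {g: statistics.stdev(v) for g, v in data.items()}
    counts = {g: 0 for g in data}
    for _ in range(n_boot):
        # Draw a fresh sample of size n per gene from the fitted normal model,
        # then apply the same rule to the replicate's sample means.
        boot_means = {
            g: statistics.fmean([rng.gauss(est_mean[g], est_sd[g]) for _ in range(n)])
            for g in data
        }
        for g in select_subset(boot_means, cutoff):
            counts[g] += 1
    return {g: c / n_boot for g, c in counts.items()}

# Toy data: "high" is far above the cutoff, "low" far below, "edge" sits on it.
data = {
    "high": [5.0, 5.2, 4.8, 5.1, 4.9],
    "low": [0.1, -0.2, 0.0, 0.2, -0.1],
    "edge": [1.0, 0.6, 1.4, 0.8, 1.2],
}
freq = bootstrap_subset_frequencies(data, cutoff=1.0)
```

A gene whose selection frequency is close to one half, like `edge` here, is one whose membership in the estimated subset should not be over-interpreted.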

12.
The venation patterns characteristic of different insect orders, and of families belonging to the same order, vary enormously in vein number, position and differentiation. Although the developmental basis of changes in vein patterns during evolution is entirely unknown, the identification of the genes and developmental processes involved in Drosophila vein pattern formation facilitates the elaboration of construction rules. It is thus possible to identify the likely changes which may constitute a source of pattern variation during evolution. In this review, we discuss how actual patterns of venation could be accounted for by modifications in different Pterygota of a common set of developmental operations. We argue that the individual specification of each vein and the modular structure of the regulatory regions of the key genes identified in Drosophila offer candidate entry points for pattern modifications affecting individual veins or interveins independently. Assuming a general conservation of the processes involved in different species, the transitions between different patterns may require few changes in the regulatory gene networks involved.

13.
Regulating developmental transitions, cell proliferation and cell death through differential gene expression is essential to the ontogeny of all multicellular organisms. Chromatin remodeling is an active process that is necessary for managing the genome-wide suppression of gene activities resulting from DNA compaction. Recent data in plants suggest a general theme, whereby chromatin remodeling complexes containing nuclear actin-related proteins (ARPs) potentiate the activities of crucial regulatory genes involved in plant growth and development, in addition to their basal activities on a much larger set of genes.

14.
15.
Recent research has demonstrated quite convincingly that accurate cancer diagnosis can be achieved by constructing classifiers that are designed to compare the gene expression profile of a tissue of unknown cancer status to a database of stored expression profiles from tissues of known cancer status. This paper introduces the JCFO, a novel algorithm that uses a sparse Bayesian approach to jointly identify both the optimal nonlinear classifier for diagnosis and the optimal set of genes on which to base that diagnosis. We show that the diagnostic classification accuracy of the proposed algorithm is superior to a number of current state-of-the-art methods in a full leave-one-out cross-validation study of five widely used benchmark datasets. In addition to its superior classification accuracy, the algorithm is designed to automatically identify a small subset of genes (typically around twenty in our experiments) that are capable of providing complete discriminatory information for diagnosis. Focusing attention on a small subset of genes is useful not only because it produces a classifier with good generalization capacity, but also because this set of genes may provide insights into the mechanisms responsible for the disease itself. A number of the genes identified by the JCFO in our experiments are already in use as clinical markers for cancer diagnosis; some of the remaining genes may be excellent candidates for further clinical investigation. If it is possible to identify a small set of genes that is indeed capable of providing complete discrimination, inexpensive diagnostic assays might be widely deployable in clinical settings.

16.
The CXB set of recombinant inbred mouse strains provided an opportunity to observe the effects of reassorted subsets of genes on the shape of the mandible. The distances between 12 landmarks in all paired combinations were calculated to evaluate genetic control in small regions. The genetic relationships between interlandmark distances revealed genes to have most of their effects in localized regions, and the greater heritabilities usually to apply to those distances between adjacent landmarks. Interrelationships between measurements are usually explicable on a developmental basis. It is proposed that genes of this sort bring about the changes seen in organ shape during evolution. A model plan for the organization of gene activation during morphogenesis is described.

17.
Filamentous soil bacteria of the genus Streptomyces carry out complex developmental cycles that result in sporulation and production of numerous secondary metabolites with pharmaceutically important activities. To further characterize the molecular basis of these developmental events, we screened for mutants of Streptomyces coelicolor that exhibit aberrant morphological differentiation and/or secondary metabolite production. On the basis of this screening analysis and the subsequent complementation analysis of the mutants obtained we assigned developmental roles to a gene involved in methionine biosynthesis (metH) and two previously uncharacterized genes (SCO6938 and SCO2525) and we reidentified two previously described developmental genes (bldA and bldM). In contrast to most previously studied genes involved in development, the genes newly identified in the present study all appear to encode biosynthetic enzymes instead of regulatory proteins. The MetH methionine synthase appears to be required for conversion of aerial hyphae into chains of spores, SCO6938 is a probable acyl coenzyme A dehydrogenase that contributes to the proper timing of aerial mycelium formation and antibiotic production, and SCO2525 is a putative methyltransferase that influences various aspects of colony growth and development.

18.
Most of the conventional feature selection algorithms have a drawback whereby a weakly ranked gene that could perform well in terms of classification accuracy with an appropriate subset of genes will be left out of the selection. Considering this shortcoming, we propose a feature selection algorithm in gene expression data analysis of sample classifications. The proposed algorithm first divides genes into subsets, the sizes of which are relatively small (roughly of size h), then selects informative smaller subsets of genes (of size r < h) from a subset and merges the chosen genes with another gene subset (of size r) to update the gene subset. We repeat this process until all subsets are merged into one informative subset. We illustrate the effectiveness of the proposed algorithm by analyzing three distinct gene expression data sets. Our method shows promising classification accuracy for all the test data sets. We also show the relevance of the selected genes in terms of their biological functions.
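
One possible reading of the divide, select and merge loop in this abstract is sketched below. The abstract leaves the details open, so several choices here are assumptions: subsets are taken as consecutive chunks of size h, the "informative" size-r subset is found by exhaustive search, and the `score` callable stands in for whatever classification-accuracy measure the authors use. The gene names and weights are invented.

```python
from itertools import combinations

def best_subset(pool, r, score):
    """Exhaustively pick the size-r subset of pool with the highest score."""
    if len(pool) <= r:
        return list(pool)
    return list(max(combinations(pool, r), key=score))

def merge_select(genes, h, r, score):
    """Select r genes per chunk of size h, merging survivors chunk by chunk."""
    if not genes:
        return []
    chunks = [genes[i:i + h] for i in range(0, len(genes), h)]
    current = best_subset(chunks[0], r, score)
    for chunk in chunks[1:]:
        candidate = best_subset(chunk, r, score)          # r genes from the next subset
        current = best_subset(current + candidate, r, score)  # merge, then reselect
    return current

# Hypothetical per-gene usefulness; a real score would evaluate a classifier.
weights = {"g0": 0.1, "g1": 0.9, "g2": 0.2, "g3": 0.3,
           "g4": 0.8, "g5": 0.1, "g6": 0.4, "g7": 0.7}

def toy_score(subset):
    return sum(weights[g] for g in subset)

selected = merge_select(list(weights), h=4, r=2, score=toy_score)
```

Scoring whole subsets rather than single genes is what lets a weakly ranked gene survive when it works well in combination, although the additive toy score above does not exercise that case.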

19.
Understanding the integrated behavior of genetic regulatory networks, in which genes regulate one another's activities via RNA and protein products, is emerging as a dominant problem in systems biology. One widely studied class of models of such networks includes genes whose expression values assume Boolean values (i.e., on or off). Design decisions in the development of Boolean network models of gene regulatory systems include the topology of the network (including the distribution of input- and output-connectivity) and the class of Boolean functions used by each gene (e.g., canalizing functions, Post functions, etc.). For example, evidence from simulations suggests that biologically realistic dynamics can be produced by scale-free network topologies with canalizing Boolean functions. This work seeks further insights into the design of Boolean network models through the construction and analysis of a class of models that include more concrete biochemical mechanisms than the usual abstract model, including genes and gene products, dimerization, cis-binding sites, promoters and repressors. In this model, it is assumed that the system consists of N genes, with each gene producing one protein product. Proteins may form complexes such as dimers, trimers, etc. The model also includes cis-binding sites to which proteins may bind to form activators or repressors. Binding affinities are based on structural complementarity between proteins and binding sites, with molecular binding sites modeled by bit-strings. Biochemically plausible gene expression rules are used to derive a Boolean regulatory function for each gene in the system. The result is a network model in which both topological features and Boolean functions arise as emergent properties of the interactions of components at the biochemical level. A highly biased set of Boolean functions is observed in simulations of networks of various sizes, suggesting a new characterization of the subset of Boolean functions that are likely to appear in gene regulatory networks.
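
A stripped-down sketch of the bit-string idea follows. It assumes one activator site and one repressor site per gene, binding only when enough positions are complementary, and the rule "expressed if activated and not repressed"; protein complexes (dimers, trimers) are left out entirely. The bit-strings, the threshold and the rule are illustrative assumptions, not the paper's parameters.

```python
def complementarity(protein, site):
    """Count positions where the two bit-strings carry opposite bits."""
    return sum(p != s for p, s in zip(protein, site))

def binds(protein, site, min_match):
    return complementarity(protein, site) >= min_match

def next_state(state, proteins, sites, min_match):
    """One synchronous update; state[g] is True when gene g is expressed."""
    expressed = [proteins[g] for g in state if state[g]]
    new = {}
    for gene, site in sites.items():
        activated = any(binds(p, site["act"], min_match) for p in expressed)
        repressed = any(binds(p, site["rep"], min_match) for p in expressed)
        new[gene] = activated and not repressed
    return new

def find_attractor(state, proteins, sites, min_match):
    """Iterate until a state repeats and return the cycle that was entered."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = next_state(state, proteins, sites, min_match)
    return seen[seen.index(state):]

# Hypothetical three-gene system with 4-bit proteins and binding sites.
proteins = {"A": "0000", "B": "1111", "C": "0101"}
sites = {
    "A": {"act": "1010", "rep": "0000"},
    "B": {"act": "1111", "rep": "1010"},
    "C": {"act": "0000", "rep": "1111"},
}
start = {"A": True, "B": False, "C": False}
cycle = find_attractor(start, proteins, sites, min_match=4)
```

Here the wiring (who regulates whom) is never written down explicitly; it falls out of which bit-strings happen to be complementary, which is the sense in which topology and Boolean functions are emergent in the abstract's model.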

20.