Similar articles
Found 20 similar articles.
1.
2.
Ectopic pregnancy (EP) is an enigmatic reproductive disorder. Although tubal EP is difficult to predict, several hypotheses about its etiology have been proposed. In retrospective case-control studies, smoking is associated with an increased rate of EPs in the fallopian tube. Studies of experimental animals in vivo and human fallopian tubal tissues in vitro have suggested mechanisms of fallopian tubal damage and dysfunction induced by nicotine and other smoking-related chemicals that may explain this association. However, the pathogenesis of smoking-induced modulation of implantation leading to tubal EP is largely unknown. Because cigarette/tobacco smoke adversely affects the success of intrauterine implantation, there is a great need to determine how embryo implantation occurs in the fallopian tube in female smokers of reproductive age.

3.
4.
Variables measured in longitudinal studies of aging and longevity do not exhaust the list of all factors affecting health and mortality transitions. Unobserved factors generate hidden variability in susceptibility to diseases and death in populations and in age trajectories of longitudinally measured indices. Effects of such heterogeneity can be manifested not only in observed hazard rates but also in average trajectories of measured indices. Although effects of hidden heterogeneity on observed mortality rates are widely discussed, their role in forming age patterns of other aging-related characteristics (average trajectories of physiological state, stress resistance, etc.) is less clear. We propose a model of hidden heterogeneity to analyze its effects in longitudinal data. The approach takes the presence of hidden heterogeneity into account and incorporates several major concepts currently developing in aging research (allostatic load, aging-associated decline in adaptive capacity and stress resistance, age-dependent physiological norms). Simulation experiments confirm the identifiability of the model's parameters.

5.
MOTIVATION: The result of a typical microarray experiment is a long list of genes with corresponding expression measurements. This list is only the starting point for a meaningful biological interpretation. Modern methods identify relevant biological processes or functions from gene expression data by scoring the statistical significance of predefined functional gene groups, e.g. based on Gene Ontology (GO). We develop methods that increase the explanatory power of this approach by integrating knowledge about relationships between the GO terms into the calculation of the statistical significance. RESULTS: We present two novel algorithms that improve GO group scoring using the underlying GO graph topology. The algorithms are evaluated on real and simulated gene expression data. We show that both methods eliminate local dependencies between GO terms and point to relevant areas in the GO graph that remain undetected with state-of-the-art algorithms for scoring functional terms. A simulation study demonstrates that the new methods detect relevant biological terms at a higher rate than competing methods.
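As a rough illustration of what "scoring the statistical significance of predefined functional gene groups" can look like, the sketch below computes a one-sided hypergeometric p-value for a single gene group. The choice of the hypergeometric test, the function name and the parameter names are assumptions made for illustration; the abstract does not name the baseline test, and this is not one of the two topology-aware algorithms it introduces.

```python
from math import comb


def hypergeom_pvalue(n_genes, n_significant, group_size, group_significant):
    """Return P(X >= group_significant) for a hypergeometric draw.

    n_genes           -- all genes measured on the array
    n_significant     -- genes called significant overall
    group_size        -- genes annotated to the group (e.g. one GO term)
    group_significant -- significant genes that fall inside the group
    """
    total = comb(n_genes, group_size)
    upper = min(group_size, n_significant)
    # Sum the upper tail: every way of seeing at least the observed overlap.
    tail = sum(
        comb(n_significant, k) * comb(n_genes - n_significant, group_size - k)
        for k in range(group_significant, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 10 genes, 5 significant, a group of 5, all 5 significant.
    print(hypergeom_pvalue(10, 5, 5, 5))
```

Scoring each term independently in this way ignores the parent-child relationships between GO terms, which is the limitation the abstract says its methods address.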

6.
Many of the steps in phylogenetic reconstruction can be confounded by “rogue” taxa: taxa that cannot be placed with assurance anywhere within the tree, indeed, whose location within the tree varies with almost any choice of algorithm or parameters. Phylogenetic consensus methods, in particular, are known to suffer from this problem. In this paper, we provide a novel framework to define and identify rogue taxa. In this framework, we formulate a bicriterion optimization problem, the relative information criterion, that models the net increase in useful information present in the consensus tree when certain taxa are removed from the input data. We also provide an effective greedy heuristic to identify a subset of rogue taxa and use this heuristic in a series of experiments, with both pathological examples from the literature and a collection of large biological data sets. As the presence of rogue taxa in a set of bootstrap replicates can lead to deceptively poor support values, we propose a procedure to recompute support values in light of the rogue taxa identified by our algorithm; applying this procedure to our biological data sets caused a large number of edges to move from “unsupported” to “supported” status, indicating that many existing phylogenies should be recomputed and reevaluated to reduce any inaccuracies introduced by rogue taxa. We also discuss the implementation issues encountered while integrating our algorithm into RAxML v7.2.7, particularly those dealing with scaling up the analyses. This integration enables practitioners to benefit from our algorithm in the analysis of very large data sets (up to 2,500 taxa and 10,000 trees, although we present the results of even larger analyses).

7.
The intermediary steps between a biological hypothesis, concretized in the input data, and meaningful results, validated using biological experiments, commonly employ bioinformatics tools. Starting with storage of the data and ending with a statistical analysis of the significance of the results, every step in a bioinformatics analysis has been intensively studied and the resulting methods and models patented. This review summarizes the bioinformatics patents that have been developed mainly for the study of genes, and points out the universal applicability of bioinformatics methods to other related studies such as RNA interference. More specifically, we overview the steps undertaken in the majority of bioinformatics analyses, highlighting, for each, various approaches that have been developed to reveal details from different perspectives. First we consider data warehousing, the first task that has to be performed efficiently, optimizing the structure of the database, in order to facilitate both the subsequent steps and the retrieval of information. Next, we review data mining, which occupies the central part of most bioinformatics analyses, presenting patents concerning differential expression, unsupervised and supervised learning. Last, we discuss how networks of interactions of genes or other players in the cell may be created, which help draw biological conclusions and have been described in several patents.

8.
In phylogenetic analysis, support for a given clade is ‘hidden’ when isolated partitions support that clade less than in the analysis of combined data sets. In such simultaneous analyses, signal common to the majority of partitions dominates the topology at the expense of any signal idiosyncratic to each partition. This process is often referred to as synergy and is commonly used to validate the combination of disparate data partitions. We investigate the behaviour of hidden branch support (HBS), partitioned branch support (PBS) and hidden synapomorphy (HS) as measures of hidden support using artificial, real and experimentally manipulated phylogenetic data sets. Our analyses demonstrate that high levels of both HBS and HS can be obtained by combining data with little shared phylogenetic signal. This finding is in agreement with the original intent of hidden support metrics, which essentially quantify the extent of data set interaction, both through the dispersion of homoplasy and revelation of underlying shared signal (positive data synergy). High levels of HBS alone are insufficient to justify data combination. We advocate the use of multiple hidden support measures to distinguish between the dispersion of homoplasy and positive data synergy, and to better interpret data interactions. Furthermore, we suggest two criteria that help identify hidden support resulting from homoplasy dispersion: first, when total support decreases with the addition of a data partition and second, when total HBS per unit total support (TS) per node is similar to that derived from randomized data.

9.
A large class of neural network models have their units organized in a lattice with fixed topology or generate their topology during the learning process. These network models can be used as neighborhood-preserving maps of the input manifold, but such a structure is difficult to manage, since these maps are graphs with a number of nodes that is just one or two orders of magnitude less than the number of input points (i.e., the complexity of the map is comparable with the complexity of the manifold). Some hierarchical algorithms have been proposed to obtain a high-level abstraction of these structures. In this paper, a general structure capable of extracting high-order information from the graph generated by a large class of self-organizing networks is presented. The algorithm builds a two-layer hierarchical structure starting from the results obtained by using a neural network suited to the distribution of the input data. Moreover, the proposed algorithm is also capable of building a topology-preserving map if it is trained using a graph that is also a topology-preserving map.

10.
In the past several years many linear models have been proposed for analyzing two-color microarray data. As presented in the literature, many of these models appear dramatically different. However, many of these models are reformulations of the same basic approach to analyzing microarray data. This paper demonstrates the equivalence of some of these models. Attention is directed at choices in microarray data analysis that have a larger impact on the results than the choice of linear model.

11.
Recent work has used graphs to model expression data from microarray experiments, with a view to partitioning the genes into clusters. In this paper, we introduce the use of a decomposition by clique separators. Our aim is to improve the classical clustering methods in two ways: first, we want to allow an overlap between clusters, as this seems biologically sound, and second, we want to be guided by the structure of the graph to define the number of clusters. We test this approach with a well-known yeast database (Saccharomyces cerevisiae). Our results are good, as the expression profiles of the clusters we find are very coherent. Moreover, we are able to organize the clusters we find into another graph, and order them in a fashion that turns out to respect the chronological order defined by the sporulation process.

12.
There is a growing interest in the identification of proteins on the proteome-wide scale. Among the different kinds of protein structure identification methods, graph-theoretic methods are particularly effective. Owing to their lower cost, higher effectiveness and other advantages, they have drawn increasing attention from researchers. Specifically, graph-theoretic methods have been widely used in homology identification, side-chain cluster identification, peptide sequencing and so on. This paper reviews several methods for solving protein structure identification problems using graph theory. We mainly introduce classical methods and mathematical models, including homology modeling based on clique finding, identification of side-chain clusters in protein structures based on the graph spectrum, and de novo peptide sequencing via tandem mass spectrometry using the spectrum graph model. In addition, concluding remarks and future priorities are given for each method.

13.
By measuring prevailing distances between YY, YR, RR, and RY dinucleotides in the large database of the nucleosome DNA fragments from C. elegans, the consensus sequence structure of the nucleosome DNA repeat of C. elegans was reconstructed: (YYYYYRRRRR)n. The actual period was estimated to be 10.4 bases. The pattern is fully consistent with the nucleosome DNA patterns of other eukaryotes, as established earlier, and, thus, the YYYYYRRRRR repeat can be considered the consensus nucleosome DNA sequence repeat across eukaryotic species. Similar distance analysis for [A, T] dinucleotides suggested the related pattern (TTTYTARAAA)n, where the TT and AA dinucleotides display rather out-of-phase behavior, contrary to the "AA or TT" in-phase periodicity considered in some publications. A weak 5-base periodicity in the distribution of TA dinucleotides was detected.
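The distance analysis described above can be sketched roughly as follows: translate a sequence into the purine/pyrimidine (R/Y) alphabet, then count how often one dinucleotide is followed by another at each distance. This is only a minimal sketch under stated assumptions; the function names, the `max_distance` parameter and the toy sequence are hypothetical, and the abstract does not describe the authors' actual procedure or normalization.

```python
from collections import Counter

# Purines (A, G) map to R; pyrimidines (C, T) map to Y.
PURINE_PYRIMIDINE = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def to_ry(seq):
    """Translate a DNA string into the two-letter R/Y alphabet."""
    return "".join(PURINE_PYRIMIDINE[base] for base in seq.upper())


def dinucleotide_distances(ry_seq, first, second, max_distance):
    """Count occurrences of `second` starting d bases after `first`, for d = 1..max_distance."""
    starts_first = [i for i in range(len(ry_seq) - 1) if ry_seq[i:i + 2] == first]
    starts_second = {i for i in range(len(ry_seq) - 1) if ry_seq[i:i + 2] == second}
    counts = Counter()
    for i in starts_first:
        for d in range(1, max_distance + 1):
            if i + d in starts_second:
                counts[d] += 1
    return counts


if __name__ == "__main__":
    # Toy sequence (an assumption): three copies of a perfect YYYYYRRRRR repeat.
    toy = to_ry("CCCCCGGGGG" * 3)
    histogram = dinucleotide_distances(toy, "YY", "YY", max_distance=12)
    for distance in sorted(histogram):
        print(distance, histogram[distance])
```

On real nucleosome fragments, a periodic signal would show up as peaks in such a histogram at multiples of the repeat length.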

14.
Here we present cryoelectron crystallographic analysis of an isolated dimeric oxygen-evolving complex of photosystem II (at a resolution of approximately 0.9 nm), revealing that the D1-D2 reaction center (RC) proteins are centrally located between the chlorophyll-binding proteins, CP43 and CP47. This conclusion supports the hypothesis that photosystems I and II have similar structural features and share a common evolutionary origin. Additional density connecting the two halves of the dimer, which was not observed in a recently described CP47-RC complex that did not include CP43, may be attributed to the small subunits that are involved in regulating secondary electron transfer, such as PsbH. These subunits are possibly also required for stabilization of the dimeric photosystem II complex. This complex, containing at least 29 transmembrane helices in its asymmetric unit, represents one of the largest membrane protein complexes studied at this resolution.

15.
The generation interval is the time between the infection time of an infected person and the infection time of his or her infector. Probability density functions for generation intervals have been an important input for epidemic models and epidemic data analysis. In this paper, we specify a general stochastic SIR epidemic model and prove that the mean generation interval decreases when susceptible persons are at risk of infectious contact from multiple sources. The intuition behind this is that when a susceptible person has multiple potential infectors, there is a "race" to infect him or her in which only the first infectious contact leads to infection. In an epidemic, the mean generation interval contracts as the prevalence of infection increases. We call this global competition among potential infectors. When there is rapid transmission within clusters of contacts, generation interval contraction can be caused by a high local prevalence of infection even when the global prevalence is low. We call this local competition among potential infectors. Using simulations, we illustrate both types of competition. Finally, we show that hazards of infectious contact can be used instead of generation intervals to estimate the time course of the effective reproductive number in an epidemic. This approach leads naturally to partial likelihoods for epidemic data that are very similar to those that arise in survival analysis, opening a promising avenue of methodological research in infectious disease epidemiology.
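The "race" intuition can be illustrated with a small simulation. The sketch below makes strong simplifying assumptions that are not taken from the paper: all potential infectors are infected at time 0, each makes infectious contact with the susceptible person after an independent exponentially distributed delay, and only the first contact counts. Under these assumptions the observed generation interval is the minimum of the delays, with expected value 1/(k × rate) for k infectors. The function and parameter names are hypothetical.

```python
import random


def mean_generation_interval(n_infectors, contact_rate=1.0, n_trials=20000, seed=0):
    """Estimate the mean time to the first infectious contact among n_infectors.

    Each infector's contact delay is drawn from an exponential distribution
    (an illustrative assumption); the winner of the race sets the interval.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        total += min(rng.expovariate(contact_rate) for _ in range(n_infectors))
    return total / n_trials


if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        print(k, round(mean_generation_interval(k), 3))
```

The estimated mean shrinks as the number of competing infectors grows, which mirrors the contraction the abstract describes, although the paper's general stochastic SIR model is not restricted to exponential delays.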

16.
17.
18.
The recent demonstration that biochemical pathways from diverse organisms are arranged in scale-free, rather than random, systems [Jeong et al., Nature 407 (2000) 651-654], emphasizes the importance of developing methods for the identification of biochemical nexuses--the nodes within biochemical pathways that serve as the major input/output hubs, and therefore represent potentially important targets for modulation. Here we describe a bioinformatics approach that identifies candidate nexuses for biochemical pathways without requiring functional gene annotation; we also provide proof-of-principle experiments to support this technique. This approach, called Nexxus, may lead to the identification of new signal transduction pathways and targets for drug design.

19.
DNA amplifications and deletions characterize cancer genomes and are often related to disease evolution. Microarray-based techniques for measuring these DNA copy-number changes use fluorescence ratios at arrayed DNA elements (BACs, cDNA, or oligonucleotides) to provide signals at high resolution, in terms of genomic locations. These data are then further analyzed to map aberrations and boundaries and identify biologically significant structures. We develop a statistical framework that enables the casting of several DNA copy number data analysis questions as optimization problems over real-valued vectors of signals. The simplest form of the optimization problem seeks to maximize $\varphi(I) = \sum_{i \in I} \nu_i / \sqrt{|I|}$ over all subintervals $I$ of the input vector. We present and prove a linear time approximation scheme for this problem, namely, a process with time complexity $O(n\varepsilon^{-2})$ that outputs an interval for which $\varphi(I)$ is at least $\mathrm{Opt}/\alpha(\varepsilon)$, where $\mathrm{Opt}$ is the actual optimum and $\alpha(\varepsilon) \to 1$ as $\varepsilon \to 0$. We further develop practical implementations that improve the performance of the naive quadratic approach by orders of magnitude. We discuss properties of optimal intervals and how they apply to the algorithm performance. We benchmark our algorithms on synthetic as well as publicly available DNA copy number data. We demonstrate the use of these methods for identifying aberrations in single samples as well as common alterations in fixed sets and subsets of breast cancer samples.
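For orientation, the naive quadratic approach mentioned in the abstract can be sketched as an exhaustive search that scores every subinterval by the sum of its signal values divided by the square root of its length. This is a baseline sketch only, not the linear time approximation scheme the authors present; the function name and the example signal are assumptions.

```python
from math import sqrt


def best_interval(signal):
    """Return (score, (start, end)) maximizing sum(signal[start:end + 1]) / sqrt(length).

    Exhaustive O(n^2) search over all subintervals; `end` is inclusive.
    """
    if not signal:
        raise ValueError("signal must contain at least one value")
    best_score = float("-inf")
    best_bounds = (0, 0)
    for start in range(len(signal)):
        running_sum = 0.0
        for end in range(start, len(signal)):
            running_sum += signal[end]
            score = running_sum / sqrt(end - start + 1)
            if score > best_score:
                best_score = score
                best_bounds = (start, end)
    return best_score, best_bounds


if __name__ == "__main__":
    # Hypothetical log-ratio signal with one amplified stretch.
    example = [0.0, 1.0, 1.0, 1.0, 1.0, -3.0, 0.5]
    print(best_interval(example))
```

Dividing by the square root of the length rewards long stretches of consistently elevated signal over single high values, which is why the search is over intervals rather than individual points.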

20.
Click-evoked otoacoustic emissions (CEOAEs) were studied by means of recurrence quantification analysis (RQA) and were found to be endowed with a relevant amount of deterministic structuring. Such a structure showed highly significant correlation with the clinical evaluation of the signal over a data set including 56 signals. Moreover, 1) one of the RQA variables, Trend, was very sensitive to phase transitions in the dynamical regime of CEOAEs, and 2) appropriate use of principal component analysis proved able to isolate the individual character of the studied signals. These results are of general interest for the study of auditory signal transduction and generation mechanisms.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号