1.

A biologist's guide to model selection and causal inference

Zachary M. Laubach Eleanor J. Murray Kim L. Hoke Rebecca J. Safran Wei Perng 《Proceedings. Biological sciences / The Royal Society》2021,288(1943)

2.

The Teacher, the Physician and the Person: Exploring Causal Connections between Teaching Performance and Role Model Types Using Directed Acyclic Graphs

Benjamin C. M. Boerebach Kiki M. J. M. H. Lombarts Albert J. J. Scherpbier Onyebuchi A. Arah 《PloS one》2013,8(7)

Background

In fledgling areas of research, evidence supporting causal assumptions is often scarce due to the small number of empirical studies conducted. In many studies it remains unclear what impact explicit and implicit causal assumptions have on the research findings; only the primary assumptions of the researchers are often presented. This is particularly true for research on the effect of faculty’s teaching performance on their role modeling. Therefore, there is a need for robust frameworks and methods for transparent formal presentation of the underlying causal assumptions used in assessing the causal effects of teaching performance on role modeling. This study explores the effects of different (plausible) causal assumptions on research outcomes.

Methods

This study revisits a previously published study about the influence of faculty’s teaching performance on their role modeling (as teacher-supervisor, physician and person). We drew eight directed acyclic graphs (DAGs) to visually represent different plausible causal relationships between the variables under study. These DAGs were subsequently translated into corresponding statistical models, and regression analyses were performed to estimate the associations between teaching performance and role modeling.

Results

The different causal models were compatible with major differences in the magnitude of the relationship between faculty’s teaching performance and their role modeling. Odds ratios for the associations between teaching performance and the three role model types ranged from 31.1 to 73.6 for the teacher-supervisor role, from 3.7 to 15.5 for the physician role, and from 2.8 to 13.8 for the person role.

Conclusions

Different sets of assumptions about causal relationships in role modeling research can be visually depicted using DAGs, which are then used to guide both statistical analysis and interpretation of results. Since study conclusions can be sensitive to different causal assumptions, results should be interpreted in the light of causal assumptions made in each study.

3.

Phylogenetic networks: modeling, reconstructibility, and accuracy (cited by: 1; self-citations: 0; citations by others: 1)

Moret BM Nakhleh L Warnow T Linder CR Tholse A Padolina A Sun J Timme R 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2004,1(1):13-23

Phylogenetic networks model the evolutionary history of sets of organisms when events such as hybrid speciation and horizontal gene transfer occur. In spite of their widely acknowledged importance in evolutionary biology, phylogenetic networks have so far been studied mostly for specific data sets. We present a general definition of phylogenetic networks in terms of directed acyclic graphs (DAGs) and a set of conditions. Further, we distinguish between model networks and reconstructible ones and characterize the effect of extinction and taxon sampling on the reconstructibility of the network. Simulation studies are a standard technique for assessing the performance of phylogenetic methods. A main step in such studies entails quantifying the topological error between the model and inferred phylogenies. While many measures of tree topological accuracy have been proposed, none exist for phylogenetic networks. Previously, we proposed the first such measure, which applied only to a restricted class of networks. In this paper, we extend that measure to apply to all networks, and prove that it is a metric on the space of phylogenetic networks. Our results allow for the systematic study of existing network methods, and for the design of new accurate ones.

4.

Food web networks: Scaling relation revisited

《Ecological Complexity》2005,2(4):323-338

5.

Gaining confidence in high-throughput protein interaction networks (cited by: 1; self-citations: 0; citations by others: 1)

Bader JS Chaudhuri A Rothberg JM Chant J 《Nature biotechnology》2004,22(1):78-85

6.

Graph-based iterative Group Analysis enhances microarray interpretation

Rainer Breitling Anna Amtmann Pawel Herzyk 《BMC bioinformatics》2004,5(1):100

Background

One of the most time-consuming tasks after performing a gene expression experiment is the biological interpretation of the results by identifying physiologically important associations between the differentially expressed genes. A large part of the relevant functional evidence can be represented in the form of graphs, e.g. metabolic and signaling pathways, protein interaction maps, shared GeneOntology annotations, or literature co-citation relations. Such graphs are easily constructed from available genome annotation data. The problem of biological interpretation can then be described as identifying the subgraphs showing the most significant patterns of gene expression. We applied a graph-based extension of our iterative Group Analysis (iGA) approach to obtain a statistically rigorous identification of the subgraphs of interest in any evidence graph.

7.

Class Prediction and Feature Selection with Linear Optimization for Metagenomic Count Data

Zhenqiu Liu Dechang Chen Li Sheng Amy Y. Liu 《PloS one》2013,8(3)

The amount of metagenomic data is growing rapidly while the computational methods for metagenome analysis are still in their infancy. It is important to develop novel statistical learning tools for the prediction of associations between bacterial communities and disease phenotypes and for the detection of differentially abundant features. In this study, we presented a novel statistical learning method for simultaneous association prediction and feature selection with metagenomic samples from two or multiple treatment populations on the basis of count data. We developed a linear programming based support vector machine with and joint penalties for binary and multiclass classifications with metagenomic count data (metalinprog). We evaluated the performance of our method on several real and simulation datasets. The proposed method can simultaneously identify features and predict classes with the metagenomic count data.

8.

The emergence of scaling in sequence-based physical models of protein evolution

Deeds EJ Shakhnovich EI 《Biophysical journal》2005,88(6):3905-3911

It has recently been discovered that many biological systems, when represented as graphs, exhibit a scale-free topology. One such system is the set of structural relationships among protein domains. The scale-free nature of this and other systems has previously been explained using network growth models that, although motivated by biological processes, do not explicitly consider the underlying physics or biology. In this work we explore a sequence-based model for the evolution of protein structures and demonstrate that this model is able to recapitulate the scale-free nature observed in graphs of real protein structures. We find that this model also reproduces other statistical features of the protein domain graph. This represents, to our knowledge, the first such microscopic, physics-based evolutionary model for a scale-free network of biological importance and as such has strong implications for our understanding of the evolution of protein structures and of other biological networks.

9.

Detecting hierarchical structure in molecular characteristics of disease using transitive approximations of directed graphs

Jacob J Jentsch M Kostka D Bentink S Spang R 《Bioinformatics (Oxford, England)》2008,24(7):995-1001

MOTIVATION: Molecular diagnostics aims at classifying diseases into clinically relevant sub-entities based on molecular characteristics. Typically, the entities are split into subgroups, which might contain several variants, yielding a hierarchical model of the disease. Recent years have introduced a plethora of new molecular screening technologies to molecular diagnostics. As a result, molecular profiles of patients became complex and the classification task more difficult. RESULTS: We present a novel tool for detecting hierarchical structure in binary datasets. We aim to identify molecular characteristics that stochastically imply other characteristics. The final hierarchical structure is encoded in a directed transitive graph where nodes represent molecular characteristics and a directed edge from a node A to a node B denotes that almost all cases with characteristic B also display characteristic A. Naturally, these graphs need to be transitive. At the core of our modeling approach lies the problem of calculating good transitive approximations of given directed but not necessarily transitive graphs. By good transitive approximations we mean transitive graphs that differ from the reference graph in only a small number of edges. It is known that the problem of finding an optimal transitive approximation is NP-complete. Here we develop an efficient heuristic for generating good transitive approximations. We evaluate the computational efficiency of the algorithm in simulations, and demonstrate its use in the context of a large genome-wide study on mature aggressive lymphomas. AVAILABILITY: The software used in our analysis is freely available from http://compdiag.uni-regensburg.de/software/transApproxs.shtml.
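
The notion of a transitive approximation in this abstract can be illustrated with a small sketch. It is not the authors' heuristic: it only computes the transitive closure (one transitive graph that contains the reference graph, not necessarily the closest one) and counts how many edges differ from the reference. The function names and the toy graph are hypothetical.

```python
def is_transitive(edges):
    """True if a->b and b->c always imply a->c (self-loops are ignored)."""
    edges = set(edges)
    return all((a, d) in edges
               for (a, b) in edges for (c, d) in edges
               if b == c and a != d)

def transitive_closure(nodes, edges):
    """Warshall-style closure: add a->c whenever a path a->...->c exists."""
    closure = set(edges)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if i != j and (i, k) in closure and (k, j) in closure:
                    closure.add((i, j))
    return closure

def edge_distance(edges_a, edges_b):
    """Number of edges present in exactly one of the two graphs."""
    return len(set(edges_a) ^ set(edges_b))

# Hypothetical reference graph over four characteristics; A->C is missing,
# so the graph is not transitive.
nodes = ["A", "B", "C", "D"]
reference = {("A", "B"), ("B", "C"), ("A", "D")}

closure = transitive_closure(nodes, reference)
distance = edge_distance(reference, closure)
print(is_transitive(reference), is_transitive(closure), distance)
```

In this toy case the closure differs from the reference by one edge, but deleting B->C would also give a transitive graph at distance one. Choosing among such candidates is the optimization problem the abstract describes as NP-complete.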

10.

Can Sibling Sex Ratios Be Used as a Valid Test for the Prenatal Androgen Hypothesis of Autism Spectrum Disorders?

Keely Cheslack-Postava Ezra Susser Kayuet Liu Peter S. Bearman 《PloS one》2015,10(10)

Background

Sibling sex ratios have been applied as an indirect test of a hypothesized association between prenatal testosterone levels and risk for autism, a developmental disorder disproportionately affecting males. Differences in sibling sex ratios between those with and without autism would provide evidence of a shared risk factor for autism and offspring sex. Conclusions related to prenatal testosterone, however, require additional assumptions. Here, we used directed acyclic graphs (DAGs) to clarify the elements required for a valid test of the hypothesis that sibling sex ratios differ between children with and without autism. We then conducted such a test using a large, population-based sample of children.

Methods

Over 1.1 million subjects, born in California from 1992–2007, and identified through birth records, were included. The association between autism diagnosis, determined using the administrative database of the California Department of Developmental Services, and the sex of the subsequent sibling was examined using generalized estimating equations. Sources of potential bias identified using DAGs were addressed.

Results

Among male children with autism, 52.2% of next-born siblings were brothers, versus 51.0% for unaffected males. For females with autism, 50.2% of following siblings were brothers versus 51.2% among control females. The relative risk of a subsequent male sibling associated with autism diagnosis was 1.02 (95% confidence interval: 0.99, 1.04).

Conclusions

In a large, population-based sample we failed to find evidence suggesting an excess of brothers among children with autism while controlling for several threats to validity. This test cannot rule out a role of any given exposure, including prenatal testosterone, in either risk of autism or offspring sex ratio, but suggests against a common cause of both.

11.

Gene recognition based on DAG shortest paths

Chuang JS Roth D 《Bioinformatics (Oxford, England)》2001,17(Z1):S56-S64

We describe DAGGER, an ab initio gene recognition program which combines the output of high dimensional signal sensors in an intuitive gene model based on directed acyclic graphs. In the first stage, candidate start, donor, acceptor, and stop sites are scored using the SNoW learning architecture. These sites are then used to generate a directed acyclic graph in which each source-sink path represents a possible gene structure. Training sequences are used to optimize an edge weighting function so that the shortest source-sink path maximizes exon-level prediction accuracy. Experimental evaluation of prediction accuracy on two benchmark data sets demonstrates that DAGGER is competitive with ab initio gene finding programs based on Hidden Markov Models.
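
The graph step in this abstract, finding the shortest source-sink path in a weighted DAG, can be sketched as follows. This is a minimal illustration and not DAGGER itself: the node names, the fixed edge weights (standing in for the learned edge weighting function), and the function `dag_shortest_path` are all hypothetical.

```python
from collections import defaultdict, deque

def dag_shortest_path(edges, source, sink):
    """Return (cost, path) of the cheapest source-to-sink path in a DAG.

    edges: iterable of (u, v, weight) tuples describing an acyclic graph.
    """
    adj = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set()
    for u, v, w in edges:
        adj[u].append((v, w))
        indeg[v] += 1
        nodes.update((u, v))

    # Kahn's algorithm gives a topological order of the nodes.
    queue = deque(n for n in nodes if indeg[n] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")

    # Relax every edge once, visiting nodes in topological order.
    dist = {n: float("inf") for n in nodes}
    prev = {}
    dist[source] = 0.0
    for u in order:
        if dist[u] == float("inf"):
            continue
        for v, w in adj[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                prev[v] = u
    if dist[sink] == float("inf"):
        return float("inf"), []

    # Walk the predecessor links back from the sink.
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[sink], path[::-1]

# Hypothetical toy graph: candidate sites as nodes, made-up weights.
toy_edges = [
    ("source", "start1", 1.0),
    ("source", "start2", 2.0),
    ("start1", "donor1", 4.0),
    ("start2", "donor1", 1.0),
    ("donor1", "acceptor1", 2.0),
    ("acceptor1", "stop1", 1.0),
    ("start1", "stop1", 9.0),
    ("stop1", "sink", 0.5),
]
cost, path = dag_shortest_path(toy_edges, "source", "sink")
print(cost, path)
```

Because the graph is acyclic, one pass over the edges in topological order is enough, so the search runs in time linear in the number of nodes and edges.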

12.

Distance-based analysis of variance for brain connectivity

Russell T. Shinohara Haochang Shou Marco Carone Robert Schultz Birkan Tunc Drew Parker Melissa Lynne Martin Ragini Verma 《Biometrics》2020,76(1):257-269

The field of neuroimaging dedicated to mapping connections in the brain is increasingly being recognized as key for understanding neurodevelopment and pathology. Networks of these connections are quantitatively represented using complex structures, including matrices, functions, and graphs, which require specialized statistical techniques for estimation and inference about developmental and disorder-related changes. Unfortunately, classical statistical testing procedures are not well suited to high-dimensional testing problems. In the context of global or regional tests for differences in neuroimaging data, traditional analysis of variance (ANOVA) is not directly applicable without first summarizing the data into univariate or low-dimensional features, a process that might mask the salient features of high-dimensional distributions. In this work, we consider a general framework for two-sample testing of complex structures by studying generalized within-group and between-group variances based on distances between complex and potentially high-dimensional observations. We derive an asymptotic approximation to the null distribution of the ANOVA test statistic, and conduct simulation studies with scalar and graph outcomes to study finite sample properties of the test. Finally, we apply our test to our motivating study of structural connectivity in autism spectrum disorder.
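
The idea of within-group and between-group variances built from pairwise distances can be sketched with a pseudo-F statistic. The form below is the widely used PERMANOVA-style statistic and is an assumption here; the paper's exact statistic and its asymptotic null distribution may differ. The function name `pseudo_f` and the scalar toy data are hypothetical.

```python
from itertools import combinations

def pseudo_f(items, labels, dist):
    """Distance-based pseudo-F: between-group over within-group variation.

    items:  observations of any type (scalars, matrices, graphs, ...)
    labels: group label of each observation
    dist:   function returning the distance between two observations
    """
    n = len(items)
    groups = sorted(set(labels))
    k = len(groups)

    def sum_sq(indices):
        # Sum of squared distances over all unordered pairs of indices.
        return sum(dist(items[i], items[j]) ** 2
                   for i, j in combinations(indices, 2))

    ss_total = sum_sq(range(n)) / n
    ss_within = 0.0
    for g in groups:
        members = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum_sq(members) / len(members)
    ss_between = ss_total - ss_within
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Toy scalar data: with absolute difference as the distance, the statistic
# reduces to the classical one-way ANOVA F.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
groups = ["a", "a", "a", "b", "b", "b"]
f_stat = pseudo_f(values, groups, lambda x, y: abs(x - y))
print(f_stat)
```

Only the `dist` argument changes when the observations are graphs or matrices rather than scalars, which is what makes a distance-based formulation attractive for complex structures.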

13.

Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research (cited by: 36; self-citations: 0; citations by others: 36)

Conesa A Götz S García-Gómez JM Terol J Talón M Robles M 《Bioinformatics (Oxford, England)》2005,21(18):3674-3676

SUMMARY: We present here Blast2GO (B2G), a research tool designed with the main purpose of enabling Gene Ontology (GO) based data mining on sequence data for which no GO annotation is yet available. B2G joins in one application GO annotation based on similarity searches with statistical analysis and highlighted visualization on directed acyclic graphs. This tool offers a suitable platform for functional genomics research in non-model species. B2G is an intuitive and interactive desktop application that allows monitoring and comprehension of the whole annotation and analysis process. AVAILABILITY: Blast2GO is freely available via Java Web Start at http://www.blast2go.de. SUPPLEMENTARY MATERIAL: http://www.blast2go.de -> Evaluation.

14.

PathFinder: reconstruction and dynamic visualization of metabolic pathways

Goesmann A Haubrock M Meyer F Kalinowski J Giegerich R 《Bioinformatics (Oxford, England)》2002,18(1):124-129

MOTIVATION: Beyond methods for a gene-wise annotation and analysis of sequenced genomes, new automated methods for functional analysis on a higher level are needed. The identification of realized metabolic pathways provides valuable information on gene expression and regulation. Detection of incomplete pathways helps to improve a constantly evolving genome annotation or discover alternative biochemical pathways. To utilize automated genome analysis on the level of metabolic pathways, new methods for the dynamic representation and visualization of pathways are needed. RESULTS: PathFinder is a tool for the dynamic visualization of metabolic pathways based on annotation data. Pathways are represented as directed acyclic graphs; graph layout algorithms accomplish the dynamic drawing and visualization of the metabolic maps. A more detailed analysis of the input data on the level of biochemical pathways helps to identify genes and detect improper parts of annotations. As a Relational Database Management System (RDBMS) based internet application, PathFinder reads a list of EC-numbers or a given annotation in EMBL- or Genbank-format and dynamically generates pathway graphs.

15.

Information indices with high discriminative power for graphs

Dehmer M Grabner M Varmuza K 《PloS one》2012,7(2):e31214

16.

Total ancestry measure: quantifying the similarity in tree-like classification, with genomic applications (cited by: 1; self-citations: 0; citations by others: 1)

Yu H Jansen R Stolovitzky G Gerstein M 《Bioinformatics (Oxford, England)》2007,23(16):2163-2173

MOTIVATION: Many classifications of protein function such as Gene Ontology (GO) are organized in directed acyclic graph (DAG) structures. In these classifications, the proteins are terminal leaf nodes; the categories 'above' them are functional annotations at various levels of specialization and the computation of a numerical measure of relatedness between two arbitrary proteins is an important proteomics problem. Moreover, analogous problems are important in other contexts in large-scale information organization, e.g. the Wikipedia online encyclopedia and the Yahoo and DMOZ web page classification schemes. RESULTS: Here we develop a simple probabilistic approach for computing this relatedness quantity, which we call the total ancestry method. Our measure is based on counting the number of leaf nodes that share exactly the same set of 'higher up' category nodes in comparison to the total number of classified pairs (i.e. the chance for the same total ancestry). We show such a measure is associated with a power-law distribution, allowing for the quick assessment of the statistical significance of shared functional annotations. We formally compare it with other quantitative functional similarity measures (such as shortest path within a DAG, lowest common ancestor shared and Azuaje's information-theoretic similarity) and provide concrete metrics to assess differences. Finally, we provide a practical implementation for our total ancestry measure for GO and the MIPS functional catalog and give two applications of it in specific functional genomics contexts. AVAILABILITY: The implementations and results are available through our supplementary website at: http://gersteinlab.org/proj/funcsim. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
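
One reading of the counting idea in this abstract can be sketched as follows: for each pair of leaf nodes, collect the set of category nodes that both leaves sit under, then report the fraction of all leaf pairs sharing exactly that same set. This is an interpretation of the abstract's wording, not the authors' implementation; the toy classification, the protein names, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def ancestors(node, parents):
    """All category nodes reachable from `node` by following parent links."""
    seen = set()
    stack = list(parents.get(node, ()))
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return frozenset(seen)

def total_ancestry_scores(leaves, parents):
    """Map each leaf pair to the fraction of pairs with the same shared ancestry."""
    anc = {leaf: ancestors(leaf, parents) for leaf in leaves}
    shared = {(a, b): anc[a] & anc[b]
              for a, b in combinations(sorted(leaves), 2)}
    counts = Counter(shared.values())
    n_pairs = len(shared)
    return {pair: counts[s] / n_pairs for pair, s in shared.items()}

# Hypothetical DAG: "kinase" has two parent categories, so this is not a tree.
parents = {
    "catalysis": ["root"],
    "binding": ["root"],
    "kinase": ["catalysis", "binding"],
    "p1": ["kinase"],
    "p2": ["kinase"],
    "p3": ["catalysis"],
    "p4": ["binding"],
    "p5": ["binding"],
}
scores = total_ancestry_scores(["p1", "p2", "p3", "p4", "p5"], parents)
print(scores[("p1", "p2")], scores[("p4", "p5")])
```

Under this reading, a lower score means the pair's shared ancestry is rarer among all pairs. In the toy data only p1 and p2 share the specific "kinase" category, so their score is the lowest.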

17.

Detecting higher-order interactions among the spiking events in a group of neurons

L. Martignon H. Von Hasseln S. Grün A. Aertsen G. Palm 《Biological cybernetics》1995,73(1):69-81

18.

Incorporating single-locus tests into haplotype cladistic analysis in case-control studies

Liu J Papasian C Deng HW 《PLoS genetics》2007,3(3):e46

In case-control studies, genetic associations for complex diseases may be probed either with single-locus tests or with haplotype-based tests. Although there are different views on the relative merits and preferences of the two test strategies, haplotype-based analyses are generally believed to be more powerful to detect genes with modest effects. However, a main drawback of haplotype-based association tests is the large number of distinct haplotypes, which increases the degrees of freedom for corresponding test statistics and thus reduces the statistical power. To decrease the degrees of freedom and enhance the efficiency and power of haplotype analysis, we propose an improved haplotype clustering method that is based on the haplotype cladistic analysis developed by Durrant et al. In our method, we attempt to combine the strengths of single-locus analysis and haplotype-based analysis into one single test framework. Novel in our method is that we develop a more informative haplotype similarity measurement by using p-values obtained from single-locus association tests to construct a measure of weight, which to some extent incorporates the information of disease outcomes. The weights are then used in computation of similarity measures to construct distance metrics between haplotype pairs in haplotype cladistic analysis. To assess our proposed new method, we performed simulation analyses to compare the relative performances of (1) conventional haplotype-based analysis using original haplotype, (2) single-locus allele-based analysis, (3) original haplotype cladistic analysis (CLADHC) by Durrant et al., and (4) our weighted haplotype cladistic analysis method, under different scenarios. Our weighted cladistic analysis method shows an increased statistical power and robustness, compared with the methods of haplotype cladistic analysis, single-locus test, and the traditional haplotype-based analyses. 
The real data analyses also show that our proposed method has practical significance in the human genetics field.

19.

Modeling bursty transcription and splicing with the chemical master equation

《Biophysical journal》2022,121(6):1056-1069

20.

Reconstructing pedigrees: some identifiability questions for a recombination-mutation model

Bhalchandra D. Thatte 《Journal of mathematical biology》2013,66(1-2):37-74

Pedigrees are directed acyclic graphs that represent ancestral relationships between individuals in a population. Based on a schematic recombination process, we describe two simple Markov models for sequences evolving on pedigrees: Model R (recombinations without mutations) and Model RM (recombinations with mutations). For these models, we ask an identifiability question: is it possible to construct a pedigree from the joint probability distribution of extant sequences? We present partial identifiability results for general pedigrees: we show that when the crossover probabilities are sufficiently small, certain spanning subgraph sequences can be counted from the joint distribution of extant sequences. We demonstrate how pedigrees that earlier seemed difficult to distinguish are distinguished by counting their spanning subgraph sequences.