Similar literature
 Found 20 similar articles (search time: 46 ms)
1.
Point 1: The ecological models of Alfred J. Lotka and Vito Volterra have had an enormous impact on ecology over the past century. Some of the earliest, and clearest, experimental tests of these models were famously conducted by Georgy Gause in the 1930s. Although well known, the data from these experiments are not widely available and are often difficult to analyze using standard statistical and computational tools.
Point 2: Here, we introduce the gauseR package, a collection of tools for fitting Lotka-Volterra models to time series data of one or more species. The package includes several methods for parameter estimation and optimization, and includes 42 datasets from Gause's species interaction experiments and related work. Additionally, we include with this paper a short blog post discussing the historical importance of these data and models, and an R vignette with a walk-through introducing the package methods. The package is available for download at github.com/adamtclark/gauseR.
Point 3: To demonstrate the package, we apply it to several classic experimental studies from Gause, as well as two other well-known datasets on multi-trophic dynamics on Isle Royale, and in spatially structured mite populations. In almost all cases, models fit observations closely and fitted parameter values make ecological sense.
Point 4: Taken together, we hope that the methods, data, and analyses that we present here provide a simple and user-friendly way to interact with complex ecological data. We are optimistic that these methods will be especially useful to students and educators who are studying ecological dynamics, as well as researchers who would like a fast tool for basic analyses.
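
As a rough illustration of what fitting a Lotka-Volterra model to a time series involves, the sketch below handles the one-species case (logistic growth) in Python. It is not the gauseR API: the function names, the simulated data, and the regression-on-per-capita-growth-rate estimator are all assumptions chosen for brevity.

```python
import math

def logistic(t, n0, r, k):
    """Closed-form logistic growth, the one-species Lotka-Volterra model."""
    return k / (1.0 + (k / n0 - 1.0) * math.exp(-r * t))

def fit_logistic(times, counts):
    """Estimate r and K by least squares on per-capita growth rates.

    Model: d(ln N)/dt = r - (r / K) * N, which is a straight line in N.
    """
    xs, ys = [], []
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        # Finite-difference per-capita growth rate over the interval.
        ys.append((math.log(counts[i + 1]) - math.log(counts[i])) / dt)
        # Endpoint average approximates the mean abundance over the interval.
        xs.append(0.5 * (counts[i] + counts[i + 1]))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    # Intercept is r; the line crosses zero growth at N = K.
    return intercept, -intercept / slope

# Hypothetical noise-free time series with r = 0.8 and K = 100.
times = [0.5 * i for i in range(25)]
counts = [logistic(t, n0=2.0, r=0.8, k=100.0) for t in times]
r_hat, k_hat = fit_logistic(times, counts)
print(f"estimated r = {r_hat:.3f}, estimated K = {k_hat:.1f}")
```

With real, noisy counts the finite-difference growth rates become unstable, which is one reason a package may offer several estimation and optimization methods rather than a single regression.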

2.
Full factorial breeding designs are useful for quantifying the amount of additive genetic, nonadditive genetic, and maternal variance that explains phenotypic traits. Such variance estimates are important for examining evolutionary potential. Traditionally, full factorial mating designs have been analyzed using a two-way analysis of variance, which may produce negative variance values and is not suited for unbalanced designs. Mixed-effects models do not produce negative variance values and are suited for unbalanced designs. However, extracting the variance components, calculating significance values, and estimating confidence intervals and/or power values for the components are not straightforward using traditional analytic methods. We introduce fullfact, an R package that addresses these issues and facilitates the analysis of full factorial mating designs with mixed-effects models. Here, we summarize the functions of the fullfact package. The observed data functions extract the variance explained by random and fixed effects and provide their significance. We then calculate the additive genetic, nonadditive genetic, and maternal variance components explaining the phenotype. In particular, we integrate nonnormal error structures for estimating these components for nonnormal data types. The resampled data functions are used to produce bootstrap-t confidence intervals, which can then be plotted using a simple function. We explore the fullfact package through a worked example. This package will facilitate the analyses of full factorial mating designs in R, especially for the analysis of binary, proportion, and/or count data types and for the ability to incorporate additional random and fixed effects and power analyses.

3.
  1. Neighborhood competition models are powerful tools to measure the effect of interspecific competition. Statistical methods to ease the application of these models are currently lacking.
  2. We present the forestecology package providing methods to (a) specify neighborhood competition models, (b) evaluate the effect of competitor species identity using permutation tests, and (c) measure model performance using spatial cross-validation. Following Allen and Kim (PLoS One, 15, 2020, e0229930), we implement a Bayesian linear regression neighborhood competition model.
  3. We demonstrate the package's functionality using data from the Smithsonian Conservation Biology Institute's large forest dynamics plot, part of the ForestGEO global network of research sites. Given ForestGEO's data collection protocols and data formatting standards, the package was designed with cross-site compatibility in mind. We highlight the importance of spatial cross-validation when interpreting model results.
  4. The package features (a) a tidyverse-like structure whereby verb-named functions can be modularly "piped" in sequence, (b) functions with standardized inputs and outputs of the simple features class from the sf package, and (c) an S3 object-oriented implementation of the Bayesian linear regression model. These three features allow for clear articulation of all the steps in the sequence of analysis and easy wrangling and visualization of the geospatial data. Furthermore, while the package only has Bayesian linear regression implemented, the package was designed with extensibility to other methods in mind.
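
The permutation-test idea in item 2 can be sketched generically in Python. This is an illustration under stated assumptions, not the forestecology implementation: the function name, the toy growth values, and the choice of a difference in means as the test statistic are all hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_test(growth, competitor, n_perm=2000, seed=1):
    """Two-sided permutation p-value for a difference in mean growth
    between focal trees whose competitor is species "A" versus "B"."""
    def statistic(labels):
        a = [g for g, s in zip(growth, labels) if s == "A"]
        b = [g for g, s in zip(growth, labels) if s == "B"]
        return abs(mean(a) - mean(b))

    rng = random.Random(seed)
    observed = statistic(competitor)
    labels = list(competitor)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling breaks any link between competitor identity and growth.
        rng.shuffle(labels)
        if statistic(labels) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)

# Hypothetical annual growth of twelve focal trees and their competitor species.
growth = [1.9, 2.1, 2.0, 2.2, 1.8, 2.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0]
competitor = ["A"] * 6 + ["B"] * 6
p_value = permutation_test(growth, competitor)
print(f"permutation p-value: {p_value:.4f}")
```

A small p-value here suggests that competitor identity matters for growth; if shuffling the labels left the statistic largely unchanged, species identity would add little beyond the mere presence of a competitor.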

4.
Bird harvest for recreational purposes or as a source of food is an important activity worldwide. Assessing or mitigating the impact of these additional sources of mortality on bird populations is therefore a crucial issue. The sustainability of harvest levels is, however, rarely documented, because knowledge of population dynamics remains rudimentary for many bird species. Some helpful approaches using limited demographic data can be used to provide an initial assessment of the sustainable use of harvested bird populations, and help adjust harvest levels accordingly. The Demographic Invariant Method (DIM) is used to detect overharvesting. As a complement, the Potential Take Level (PTL) approach may allow setting a level of take with regard to management objectives and/or assessing whether current harvest levels meet these objectives. Here, we present the R package popharvest, which implements these two approaches in a simple and straightforward way. The package provides users with a set of flexible functions whose arguments can be adapted to existing knowledge about population dynamics. Also, popharvest enables users to test scenarios or propagate uncertainty in demographic parameters to the assessment of sustainability through easily programmed Monte Carlo simulations. The simplicity of the package makes it a useful toolbox for wildlife managers or policymakers. This paper provides them with background on the DIM and PTL approaches and illustrates the use of popharvest's functionalities in this context.
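
To make the Monte Carlo idea concrete, here is a minimal Python sketch. It assumes a simple potential-take formula, PTL = F × (r_max / 2) × N, which is one commonly used form; the abstract does not state the formula, and popharvest's own parameterization may differ. All function names, parameter values, and distributions below are hypothetical.

```python
import random

def potential_take_level(r_max, n_pop, f_obj):
    """Assumed simple form: PTL = f_obj * (r_max / 2) * n_pop."""
    return f_obj * 0.5 * r_max * n_pop

def prob_overharvest(harvest, n_draws=10000, seed=42):
    """Share of Monte Carlo draws in which the harvest exceeds the PTL."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_draws):
        # Hypothetical uncertainty: maximum growth rate around 0.30 (sd 0.05),
        # population size uniform between 80,000 and 120,000 birds.
        r_max = max(rng.gauss(0.30, 0.05), 0.0)
        n_pop = rng.uniform(80_000, 120_000)
        if harvest > potential_take_level(r_max, n_pop, f_obj=0.5):
            exceed += 1
    return exceed / n_draws

low_risk = prob_overharvest(harvest=2_000)
high_risk = prob_overharvest(harvest=12_000)
print(f"P(overharvest) at 2,000 birds: {low_risk:.3f}")
print(f"P(overharvest) at 12,000 birds: {high_risk:.3f}")
```

Reporting the share of draws in which harvest exceeds the PTL, rather than a single yes/no answer, keeps the uncertainty in the demographic parameters visible in the sustainability assessment.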

5.
6.
Structural biology experiments and structure prediction tools have provided many high-resolution three-dimensional structures of nucleic acids. Also, molecular dynamics force field parameters have been adapted to simulating charged and flexible nucleic acid structures on microsecond time scales. Therefore, we can generate the dynamics of DNA or RNA molecules, but we still lack adequate tools for the analysis of the resulting huge amounts of data. We present MINT (Motif Identifier for Nucleic acids Trajectory), an automatic tool for analyzing three-dimensional structures of RNA and DNA, and their full-atom molecular dynamics trajectories or other conformation sets (e.g. X-ray or nuclear magnetic resonance-derived structures). For each RNA or DNA conformation, MINT determines the hydrogen bonding network resolving the base pairing patterns, and identifies secondary structure motifs (helices, junctions, loops, etc.) and pseudoknots. MINT also estimates the energy of stacking and phosphate anion-base interactions. For many conformations, as in a molecular dynamics trajectory, MINT provides averages of the above structural and energetic features and their evolution. We show MINT functionality using an all-atom, explicit-solvent molecular dynamics trajectory of the 30S ribosomal subunit.

7.
8.
Enrichment analysis of gene sets is a popular approach that provides a functional interpretation of genome-wide expression data. Existing tests are affected by inter-gene correlations, resulting in a high Type I error. The most widely used test, Gene Set Enrichment Analysis, relies on computationally intensive permutations of sample labels to generate a null distribution that preserves gene–gene correlations. A more recent approach, CAMERA, attempts to correct for these correlations by estimating a variance inflation factor directly from the data. Although these methods generate P-values for detecting gene set activity, they are unable to produce confidence intervals or allow for post hoc comparisons. We have developed a new computational framework for Quantitative Set Analysis of Gene Expression (QuSAGE). QuSAGE accounts for inter-gene correlations, improves the estimation of the variance inflation factor and, rather than evaluating the deviation from a null hypothesis with a P-value, it quantifies gene-set activity with a complete probability density function. From this probability density function, P-values and confidence intervals can be extracted and post hoc analysis can be carried out while maintaining statistical traceability. Compared with Gene Set Enrichment Analysis and CAMERA, QuSAGE exhibits better sensitivity and specificity on real data profiling the response to interferon therapy (in chronic Hepatitis C virus patients) and Influenza A virus infection. QuSAGE is available as an R package, which includes the core functions for the method as well as functions to plot and visualize the results.
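
For readers unfamiliar with the variance inflation factor, the Python sketch below computes it in the simple CAMERA-style form VIF = 1 + (m - 1) × mean pairwise correlation for a set of m genes. The formula is stated here from general knowledge rather than from the abstract, the estimator QuSAGE uses differs in detail, and the expression values are made up for illustration.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def variance_inflation_factor(gene_rows):
    """VIF = 1 + (m - 1) * mean pairwise correlation for m genes."""
    m = len(gene_rows)
    pairs = list(combinations(gene_rows, 2))
    rho_bar = sum(pearson(x, y) for x, y in pairs) / len(pairs)
    return 1.0 + (m - 1) * rho_bar

# Hypothetical residual expression of three genes across five samples.
gene_set = [
    [0.1, -0.4, 0.3, 0.8, -0.8],
    [0.2, -0.3, 0.2, 0.9, -1.0],
    [0.0, -0.5, 0.4, 0.6, -0.5],
]
vif = variance_inflation_factor(gene_set)
print(f"variance inflation factor: {vif:.2f}")
```

A VIF near 1 means the genes behave almost independently, while a VIF near m means the set carries roughly the information of a single gene, so a test that ignores the correlation would overstate significance.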

9.
Recent studies have revealed multiple mechanisms that can lead to heterogeneity in ribosomal composition. This heterogeneity can lead to preferential translation of specific panels of mRNAs, and is defined in large part by the ribosomal protein (RP) content, amongst other things. However, it is currently unknown to what extent ribosomal composition is heterogeneous across tissues, which is compounded by a lack of tools available to study it. Here we present dripARF, a method for detecting differential RP incorporation into the ribosome using Ribosome Profiling (Ribo-seq) data. We combine the ‘waste’ rRNA fragment data generated in Ribo-seq with the known 3D structure of the human ribosome to predict differences in the composition of ribosomes in the material being studied. We have validated this approach using publicly available data, and have revealed a potential role for eS25/RPS25 in development. Our results indicate that ribosome heterogeneity can be detected in Ribo-seq data, providing a new method to study this phenomenon. Furthermore, with dripARF, previously published Ribo-seq data provides a wealth of new information, allowing the identification of RPs of interest in many disease and normal contexts. dripARF is available as part of the ARF R package and can be accessed through https://github.com/fallerlab/ARF.

10.
11.
Gene set analysis using biological pathways has become a widely used statistical approach for gene expression analysis. A biological pathway can be represented through a graph where genes and their interactions are, respectively, nodes and edges of the graph. From a biological point of view only some portions of a pathway are expected to be altered; however, few methods using pathway topology have been proposed and none of them tries to identify the signal paths, within a pathway, mostly involved in the biological problem. Here, we present a novel algorithm for pathway analysis, clipper, that tries to fill this gap. clipper implements a two-step empirical approach based on the decomposition of the graph into a junction tree to reconstruct the most relevant signal path. In the first step clipper selects significant pathways according to statistical tests on the means and the concentration matrices of the graphs derived from pathway topologies. Then, it identifies within these pathways the signal paths having the greatest association with a specific phenotype. We test our approach on simulated and two real expression datasets. Our results demonstrate the efficacy of clipper in the identification of signal transduction paths totally coherent with the biological problem.

12.
The complete mitochondrial genome (mitogenome) of Cerura menciana (Lepidoptera: Notodontidae) was sequenced and analyzed in this study. The mitogenome is a circular molecule of 15,369 bp, containing 13 protein-coding genes (PCGs), two ribosomal RNA (rRNA) genes, 22 transfer RNA (tRNA) genes and an A+T-rich region. The positive AT skew (0.031) indicated that more As than Ts were present. All PCGs were initiated by ATN codons, except for the cytochrome c oxidase subunit 1 (cox1) gene, which was initiated by CAG. Two of the 13 PCGs contained the incomplete termination codon T or TA, while the others were terminated with the stop codon TAA. The A+T-rich region was 372 bp in length and consisted of an ‘ATAGA’ motif followed by an 18 bp poly-T stretch, a microsatellite-like (AT)8 and a poly-A element upstream of the trnM gene. Results examining codon usage indicated that Asn, Ile, Leu2, Lys, Tyr and Phe were the six most frequently occurring amino acids, while Cys was the rarest. Phylogenetic relationships, analyzed based on the nucleotide sequences of the 13 PCGs from other insect mitogenomes, confirmed that C. menciana belongs to the Notodontidae family.
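
The AT skew quoted above follows the standard definition (A - T) / (A + T), where A and T are base counts on the strand analyzed. A minimal Python sketch is shown below; the function names and the toy sequence are illustrative, not taken from the study.

```python
def at_skew(sequence):
    """AT skew = (A - T) / (A + T); positive values mean more As than Ts."""
    seq = sequence.upper()
    a, t = seq.count("A"), seq.count("T")
    if a + t == 0:
        raise ValueError("sequence contains no A or T")
    return (a - t) / (a + t)

def at_content(sequence):
    """Fraction of bases that are A or T."""
    seq = sequence.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)

# Hypothetical fragment; a real analysis would read the full mitogenome.
fragment = "ATAGATTTTTTTTTATATATATAAAAAAAAAAGC"
print(f"AT skew: {at_skew(fragment):.3f}, A+T content: {at_content(fragment):.3f}")
```

A value of 0.031 therefore corresponds to As making up about 51.6% of the combined A and T count.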

13.
The rapidly expanding body of available genomic and protein structural data provides a rich resource for understanding protein dynamics with biomolecular simulation. While computational infrastructure has grown rapidly, simulations on an omics scale are not yet widespread, primarily because software infrastructure to enable simulations at this scale has not kept pace. It should now be possible to study protein dynamics across entire (super)families, exploiting both available structural biology data and conformational similarities across homologous proteins. Here, we present a new tool for enabling high-throughput simulation in the genomics era. Ensembler takes any set of sequences—from a single sequence to an entire superfamily—and shepherds them through various stages of modeling and refinement to produce simulation-ready structures. This includes comparative modeling to all relevant PDB structures (which may span multiple conformational states of interest), reconstruction of missing loops, addition of missing atoms, culling of nearly identical structures, assignment of appropriate protonation states, solvation in explicit solvent, and refinement and filtering with molecular simulation to ensure stable simulation. The output of this pipeline is an ensemble of structures ready for subsequent molecular simulations using computer clusters, supercomputers, or distributed computing projects like Folding@home. Ensembler thus automates much of the time-consuming process of preparing protein models suitable for simulation, while allowing scalability up to entire superfamilies. A particular advantage of this approach can be found in the construction of kinetic models of conformational dynamics—such as Markov state models (MSMs)—which benefit from a diverse array of initial configurations that span the accessible conformational states to aid sampling. 
We demonstrate the power of this approach by constructing models for all catalytic domains in the human tyrosine kinase family, using all available kinase catalytic domain structures from any organism as structural templates. Ensembler is free and open source software licensed under the GNU General Public License (GPL) v2. It is compatible with Linux and OS X. The latest release can be installed via the conda package manager, and the latest source can be downloaded from https://github.com/choderalab/ensembler.

14.
Time course ‘omics’ experiments are becoming increasingly important to study system-wide dynamic regulation. Despite their high information content, analysis remains challenging. ‘Omics’ technologies capture quantitative measurements on tens of thousands of molecules. Therefore, in a time course ‘omics’ experiment molecules are measured for multiple subjects over multiple time points. This results in a large, high-dimensional dataset, which requires computationally efficient approaches for statistical analysis. Moreover, methods need to be able to handle missing values and various levels of noise. We present a novel, robust and powerful framework to analyze time course ‘omics’ data that consists of three stages: quality assessment and filtering, profile modelling, and analysis. The first step consists of removing molecules for which expression or abundance is highly variable over time. The second step models each molecular expression profile in a linear mixed model framework which takes into account subject-specific variability. The best model is selected through a serial model selection approach and results in dimension reduction of the time course data. The final step includes two types of analysis of the modelled trajectories, namely, clustering analysis to identify groups of correlated profiles over time, and differential expression analysis to identify profiles which differ over time and/or between treatment groups. Through simulation studies we demonstrate the high sensitivity and specificity of our approach for differential expression analysis. We then illustrate how our framework can bring novel insights on two time course ‘omics’ studies in breast cancer and kidney rejection. The methods are publicly available, implemented in the R CRAN package lmms.

15.
In biomedical studies, patients are often evaluated numerous times and a large number of variables are recorded at each time-point. Data entry and manipulation of longitudinal data can be performed using spreadsheet programs, which usually include some data plotting and analysis capabilities and are straightforward to use, but are not designed for the analysis of complex longitudinal data. Specialized statistical software offers more flexibility and capabilities, but first-time users with a biomedical background often find it difficult to use. We developed medplot, an interactive web application that simplifies the exploration and analysis of longitudinal data. The application can be used to summarize, visualize and analyze data by researchers who are not familiar with statistical programs and whose knowledge of statistics is limited. The summary tools produce publication-ready tables and graphs. The analysis tools include features that are seldom available in spreadsheet software, such as correction for multiple testing, repeated measurement analyses and flexible non-linear modeling of the association of the numerical variables with the outcome. medplot is freely available and open source. It has an intuitive graphical user interface (GUI), is accessible via the Internet, and can be used within a web browser, without the need for installing and maintaining programs locally on the user's computer. This paper describes the application and gives detailed examples describing how to use the application on real data from a clinical study including patients with early Lyme borreliosis.

16.
Climate-growth relationships are usually analysed using monthly climate data. The dendroTools R package also provides methodological approaches that enable climate-growth analysis for daily climate data. Such analysis reveals more complete climate signal patterns. In this article, new functions of the dendroTools R package are presented. Partial correlation coefficients are now implemented and can be used to calculate the strength of a linear relationship between two variables, while controlling for a third variable. Bootstrapped correlations can then be used to provide insights into the confidence intervals of statistical estimates. The calculation of partial and bootstrapped correlations is available for daily and monthly data. Finally, data transformation, S3 generic plotting and summary functions are also presented here.
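
The two statistics described above can be sketched in a few lines of Python. This is not the dendroTools interface: the function names, the simulated ring-width, temperature and precipitation series, and the choice of a percentile bootstrap are assumptions made for illustration. The first-order partial correlation uses the standard formula r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)).

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_correlation(x, y, z):
    """Correlation between x and y after controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def bootstrap_interval(x, y, z, n_boot=1000, alpha=0.05, seed=7):
    """Percentile bootstrap interval for the partial correlation."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        # Resample years with replacement, keeping the three series aligned.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(partial_correlation(
            [x[i] for i in idx], [y[i] for i in idx], [z[i] for i in idx]))
    estimates.sort()
    lower = estimates[int(alpha / 2 * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical standardized series for 40 years.
rng = random.Random(0)
precip = [rng.gauss(0, 1) for _ in range(40)]
temp = [0.6 * p + rng.gauss(0, 0.8) for p in precip]
ring_width = [0.7 * t + 0.3 * p + rng.gauss(0, 0.5)
              for t, p in zip(temp, precip)]

r_partial = partial_correlation(ring_width, temp, precip)
ci = bootstrap_interval(ring_width, temp, precip)
print(f"partial r = {r_partial:.2f}, 95% interval = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Resampling whole years keeps each ring-width value paired with its own climate values, which is what allows the bootstrap spread to reflect sampling uncertainty in the correlation itself.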

17.
18.
Apolipoprotein B (APOB) and Adiponectin Receptor 1 (ADIPOR1) are related to the regulation of feed intake, fat metabolism and protein deposition and are candidate genes for genomic studies in birds. In this study, associations of two single nucleotide polymorphisms (SNPs), g.102 A>T (APOB) and g.729 C>T (ADIPOR1), with carcass, bone integrity and performance traits in broilers were investigated. Genotyping was performed on a paternal line of 1,454 broilers. SNP detection was carried out by the PCR-RFLP technique using the restriction enzymes HhaI for the SNP g.729 C>T and MslI for the SNP g.102 A>T. The association analyses of the two SNPs with 85 traits were performed using the restricted maximum likelihood (REML) and Generalized Quasi-Likelihood Score (GQLS) methods. For REML the model included the random additive genetic effect of animal and fixed effects of sex, hatch and SNP genotypes. In the GQLS method, a logistic regression was used to associate the genotypes with phenotypes adjusted for fixed effects of sex and hatch. The SNP g.729 C>T in the ADIPOR1 gene was associated with thickness of the femur and breast skin yield. Thus, the ADIPOR1 gene seems to be implicated in fat metabolism and/or deposition and in bone integrity in broilers.

19.
In this work we develop a novel algorithm for reconstructing the genomes of ancestral individuals, given genotype or sequence data from contemporary individuals and an extended pedigree of family relationships. A pedigree with complete genomes for every individual enables the study of allele frequency dynamics and haplotype diversity across generations, including deviations from neutrality such as transmission distortion. When studying heritable diseases, ancestral haplotypes can be used to augment genome-wide association studies and track disease inheritance patterns. The building blocks of our reconstruction algorithm are segments of Identity-By-Descent (IBD) shared between two or more genotyped individuals. The method alternates between identifying a source for each IBD segment and assembling IBD segments placed within each ancestral individual. Unlike previous approaches, our method is able to accommodate complex pedigree structures with hundreds of individuals genotyped at millions of SNPs. We apply our method to an Old Order Amish pedigree from Lancaster, Pennsylvania, whose founders came to North America from Europe during the early 18th century. The pedigree includes 1338 individuals from the past 12 generations, 394 with genotype data. The motivation for reconstruction is to understand the genetic basis of diseases segregating in the family through tracking haplotype transmission over time. Using our algorithm, thread, we are able to reconstruct an average of 224 ancestral individuals per chromosome. For these ancestral individuals, on average we reconstruct 79% of their haplotypes. We also identify a region on chromosome 16 that is difficult to reconstruct; we find that this region harbors a short Amish-specific copy number variation and the gene HYDIN. thread was developed for endogamous populations, but can be applied to any extensive pedigree with the recent generations genotyped.
We anticipate that this type of practical ancestral reconstruction will become more common and necessary to understand rare and complex heritable diseases in extended families.

20.
A new series of programs for non-parametric tests has been added to SPBS, a statistical package for biological sciences that applies biostatistical methods using microcomputer software [Comput. Prog. Biomed. 14 (1982) 7–20]. Programs presented here cover non-parametric tests for multiple comparisons between two or more groups of paired or independent data.

Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.), 京ICP备09084417号