Found 20 similar articles (search time: 15 ms)
1.
The limiting spatial correlations are derived for a population of neutral alleles migrating among K colonies. The allelic population is modeled as a subcritical branching process and the limiting correlations are obtained conditional on nonextinction of the population.
2.
3.
The present work is aimed at developing the mathematical tools by which the dynamics of gene amplification (GA) can be described in detail. Some discrete compartmental models of GA by disproportionate replication and a general model for other putative GA mechanisms are presented and analyzed. The dynamical distribution of gene copy number in the cell population is calculated with the loss of cells taken either as constant or as copy-number-dependent. Our analysis shows that for a one-copy GA process with constant loss of cells, the relative frequency of single-gene-copy cells (sensitive cells) converges to zero, with the rate of convergence depending on the amplification probability. In contrast, for a one-copy GA process with copy-number-dependent loss of cells, the relative frequency of single-copy cells is bounded, implying a bounded compartment of many-gene-copy cells. Using branching process theory, we calculate the dynamical distribution of the single-gene-copy compartment as well as its extinction probability. Our models are used for estimating treatment prognosis as affected by drug resistance due to GA, showing significant differences in prognosis resulting from small changes in drug dose.
4.
This paper is concerned with the properties of a stochastic integral which arises in the study of a modified Markov branching process. Explicit expressions are found for the mean and the limit distribution of the integral.
5.
Background
The discovery and mapping of genomic variants is an essential step in most analyses done using sequencing reads. There are a number of mature software packages and associated pipelines that can identify single nucleotide polymorphisms (SNPs) with a high degree of concordance. However, the same cannot be said for tools that are used to identify the other types of variants. Indels represent the second most frequent class of variants in the human genome, after single nucleotide polymorphisms. The reliable detection of indels is still a challenging problem, especially for variants that are longer than a few bases.
Results
We have developed a set of algorithms and heuristics collectively called indelMINER to identify indels from whole genome resequencing datasets using paired-end reads. indelMINER uses a split-read approach to identify the precise breakpoints for indels of size less than a user-specified threshold, and supplements that with a paired-end approach to identify larger variants that are frequently missed with the split-read approach. We use simulated and real datasets to show that an implementation of the algorithm performs favorably when compared to several existing tools.
Conclusions
indelMINER can be used effectively to identify indels in whole-genome resequencing projects. The output is provided in the VCF format along with additional information about the variant, including information about its presence or absence in another sample. The source code and documentation for indelMINER can be freely downloaded from www.bx.psu.edu/miller_lab/indelMINER.tar.gz.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-015-0483-6) contains supplementary material, which is available to authorized users.
6.
7.
L A Solberg 《International journal of cell cloning》1990,8(4):283-290
The purpose of this paper is to describe a model of megakaryocytopoiesis as a branching process with stochastic processes regulating critical control points of differentiation along the stem cell megakaryocyte platelet axis. Progress of cells through these critical control points is regulated by transitional probabilities, which in turn are regulated by influences such as growth factors. The critical control points include transition of resting megakaryocytic stem cells (CFU-meg) into proliferating stem cells, the cessation of cytokinesis, and the cessation of DNA synthesis. A computerized computational method has been developed for directly fitting the stochastic branching model to colony growth data. The computational model has allowed transitional probabilities to be derived from colony size data. The model provides a unifying explanation for much of the heterogeneity of stages of maturation within populations of megakaryocytes and is fully compatible with historical data supporting the stochastic nature of hematopoietic stem cell regulation and with modern molecular concepts about control of the cell cycle.
8.
Determining the quality and complexity of next-generation sequencing data without a reference genome
Seyed Yahya Anvar, Lusine Khachatryan, Martijn Vermaat, Michiel van Galen, Irina Pulyakhina, Yavuz Ariyurek, Ken Kraaijeveld, Johan T den Dunnen, Peter de Knijff, Peter AC ’t Hoen, Jeroen FJ Laros 《Genome biology》2014,15(12)
We describe an open-source kPAL package that facilitates an alignment-free assessment of the quality and comparability of sequencing datasets by analyzing k-mer frequencies. We show that kPAL can detect technical artefacts such as high duplication rates, library chimeras, contamination and differences in library preparation protocols. kPAL also successfully captures the complexity and diversity of microbiomes and provides a powerful means to study changes in microbial communities. Together, these features make kPAL an attractive and broadly applicable tool to determine the quality and comparability of sequence libraries even in the absence of a reference sequence. kPAL is freely available at https://github.com/LUMC/kPAL.
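The idea of comparing datasets by their k-mer frequencies can be illustrated with a minimal sketch. This is not kPAL's API or its distance measure: the function names `kmer_profile` and `profile_distance`, the choice to skip k-mers containing non-ACGT characters, and the use of Euclidean distance on relative frequencies are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_profile(reads, k):
    """Count every overlapping k-mer in the reads (hypothetical helper).

    k-mers containing characters other than A, C, G, T are skipped;
    that filtering rule is an assumption of this sketch.
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def profile_distance(a, b):
    """Euclidean distance between two profiles scaled to relative frequencies."""
    total_a = sum(a.values()) or 1  # avoid division by zero for empty profiles
    total_b = sum(b.values()) or 1
    keys = set(a) | set(b)
    # Counter returns 0 for k-mers absent from one of the profiles.
    return sqrt(sum((a[key] / total_a - b[key] / total_b) ** 2 for key in keys))
```

Two libraries with similar composition should give a distance near zero, while contamination or a different library preparation would be expected to shift the profile and increase the distance.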
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-014-0555-3) contains supplementary material, which is available to authorized users.
9.
Summary
Next‐generation sequencing technologies are poised to revolutionize the field of biomedical research. The increased resolution of these data promises to provide a greater understanding of the molecular processes that control the morphology and behavior of a cell. However, the increased amounts of data require innovative statistical procedures that are powerful while still being computationally feasible. In this article, we present a method for identifying small RNA molecules, called miRNAs, which regulate genes by targeting their mRNAs for degradation or translational repression. In the first step of our modeling procedure, we apply an innovative dynamic linear model that identifies candidate miRNA genes in high‐throughput sequencing data. The model is flexible and can accurately identify interesting biological features while accounting for read count, read spacing, and sequencing depth. Additionally, miRNA candidates are processed using a modified Smith–Waterman sequence alignment that scores the regions for potential RNA hairpins, one of the defining features of miRNAs. We illustrate our method on simulated datasets as well as on a small RNA Caenorhabditis elegans dataset from the Illumina sequencing platform. These examples show that our method is highly sensitive for identifying known and novel miRNA genes.
10.
The major role played by environmental factors in determining the geographical range sizes of species raises the possibility of describing their long-term dynamics in relatively simple terms, a goal which has hitherto proved elusive. Here we develop a stochastic differential equation to describe the dynamics of the range size of an individual species based on the relationship between abundance and range size, derive a limiting stationary probability model to quantify the stochastic nature of the range size for that species at steady state, and then generalize this model to the species-range size distribution for an assemblage. The model fits well to several empirical datasets of the geographical range sizes of species in taxonomic assemblages, and provides the simplest explanation of species-range size distributions to date.
11.
Stochastic growth processes abound in the biology of parasitism, and one mathematical tool that is particularly well suited for describing such phenomena is the Galton-Watson branching process. Introduced more than a century ago to settle a debate over the rate of disappearance of surnames in the British peerage, branching processes are applied today in fields as diverse as quantum physics and theoretical computer science. In this article, Dale Taneyhill, Alison Dunn and Melanie Hatcher provide a simple introduction to branching processes, and demonstrate their uses in quantitative parasitology.
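For readers unfamiliar with the Galton-Watson process, a short sketch may help. It relies on the standard result that the extinction probability of a lineage started from one individual is the smallest non-negative fixed point of the offspring probability generating function, reached by iterating from zero. The function names and the example offspring distribution are hypothetical and are not taken from the article.

```python
def mean_offspring(offspring_probs):
    """Expected offspring count; offspring_probs[k] is P(k offspring)."""
    return sum(k * p for k, p in enumerate(offspring_probs))


def extinction_probability(offspring_probs, tol=1e-12, max_iter=100_000):
    """Smallest fixed point of the generating function f(s) = sum_k p_k * s**k.

    Iterating q <- f(q) from q = 0 increases monotonically toward that
    fixed point. Convergence is slow when the mean is close to 1, in
    which case the value after max_iter steps is only approximate.
    """
    def pgf(s):
        return sum(p * s**k for k, p in enumerate(offspring_probs))

    q = 0.0
    for _ in range(max_iter):
        nxt = pgf(q)
        if abs(nxt - q) < tol:
            return nxt
        q = nxt
    return q


# Hypothetical offspring distribution: P(0)=0.25, P(1)=0.25, P(2)=0.5.
probs = [0.25, 0.25, 0.5]
print(mean_offspring(probs), extinction_probability(probs))
```

When the mean offspring number is at most 1, the iteration tends to 1 (extinction is certain); when it exceeds 1, the result is strictly less than 1.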
12.
13.
Background
The investigation of plant genome structure and evolution requires comprehensive characterization of repetitive sequences that make up the majority of higher plant nuclear DNA. Since genome-wide characterization of repetitive elements is complicated by their high abundance and diversity, novel approaches based on massively-parallel sequencing are being adapted to facilitate the analysis. It has recently been demonstrated that the low-pass genome sequencing provided by a single 454 sequencing reaction is sufficient to capture information about all major repeat families, thus providing the opportunity for efficient repeat investigation in a wide range of species. However, the development of appropriate data mining tools is required in order to fully utilize this sequencing data for repeat characterization.
14.
This paper is aimed at exhibiting two striking features of the usual approach to emotional expression in science and philosophy, suggesting a different perspective. One is the generally shared belief that emotions are a state of utter disarray, which hampers objective knowledge; the other is the search for causal explanation, along a wide range of categorized approaches (psychology, neurosciences, developmental biology) each proposing its own theoretical framework. In both cases the result is to play down emotional expression. Alternatively, we propose to view emotions as something crucial in the choice of our conceptual tools, ideas and involvements, in the genesis of which various explanations interact in a complex stochastic way. Rather than being a harmful disruption of the mind calling for identification of a definite causality, emotional behaviour appears as a necessary process in cognition, which is irreducible to a unique origin.
15.
16.
PCR permits the exponential and sequence-specific amplification of DNA, even from minute starting quantities. PCR is a fundamental step in preparing DNA samples for high-throughput sequencing. However, there are errors associated with PCR-mediated amplification. Here we examine the effects of four important sources of error (bias, stochasticity, template switches and polymerase errors) on sequence representation in low-input next-generation sequencing libraries. We designed a pool of diverse PCR amplicons with a defined structure, and then used Illumina sequencing to search for signatures of each process. We further developed quantitative models for each process, and compared predictions of these models to our experimental data. We find that PCR stochasticity is the major force skewing sequence representation after amplification of a pool of unique DNA amplicons. Polymerase errors become very common in later cycles of PCR but have little impact on the overall sequence distribution as they are confined to small copy numbers. PCR template switches are rare and confined to low copy numbers. Our results provide a theoretical basis for removing distortions from high-throughput sequencing data. In addition, our findings on PCR stochasticity will have particular relevance to quantification of results from single cell sequencing, in which sequences are represented by only one or a few molecules.
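Why stochasticity matters most at low copy numbers can be seen in a toy simulation. The sketch below is not the authors' quantitative model: it assumes, purely for illustration, that in each cycle every molecule is copied independently with a fixed probability `efficiency`, and the function name `simulate_pcr` is hypothetical.

```python
import random


def simulate_pcr(initial_copies, cycles, efficiency, rng):
    """Toy PCR model: each molecule is duplicated with probability `efficiency`
    in every cycle, independently of the others (an assumption of this sketch)."""
    copies = initial_copies
    for _ in range(cycles):
        # Count how many of the current molecules are successfully copied.
        copies += sum(1 for _ in range(copies) if rng.random() < efficiency)
    return copies


# Amplify 20 hypothetical single-molecule templates and compare the outcomes.
rng = random.Random(0)
yields = [simulate_pcr(1, 10, 0.8, rng) for _ in range(20)]
print(min(yields), max(yields))
```

Under this assumption, templates that happen to miss a copy in the first few cycles stay proportionally under-represented for the rest of the run, which is one way a pool of initially equal amplicons can end up skewed.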
17.
18.
Rogers S, Girolami M, Campbell C, Breitling R 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2005,2(2):143-156
We present a new computational technique (a software implementation, data sets, and supplementary information are available at http://www.enm.bris.ac.uk/lpd/) which enables the probabilistic analysis of cDNA microarray data and we demonstrate its effectiveness in identifying features of biomedical importance. A hierarchical Bayesian model, called Latent Process Decomposition (LPD), is introduced in which each sample in the data set is represented as a combinatorial mixture over a finite set of latent processes, which are expected to correspond to biological processes. Parameters in the model are estimated using efficient variational methods. This type of probabilistic model is most appropriate for the interpretation of measurement data generated by cDNA microarray technology. For determining informative substructure in such data sets, the proposed model has several important advantages over the standard use of dendrograms. First, the ability to objectively assess the optimal number of sample clusters. Second, the ability to represent samples and gene expression levels using a common set of latent variables (dendrograms cluster samples and gene expression values separately which amounts to two distinct reduced space representations). Third, in contrast to standard cluster models, observations are not assigned to a single cluster and, thus, for example, gene expression levels are modeled via combinations of the latent processes identified by the algorithm. We show this new method compares favorably with alternative cluster analysis methods. To illustrate its potential, we apply the proposed technique to several microarray data sets for cancer. For these data sets it successfully decomposes the data into known subtypes and indicates possible further taxonomic subdivision in addition to highlighting, in a wholly unsupervised manner, the importance of certain genes which are known to be medically significant. To illustrate its wider applicability, we also demonstrate its performance on a microarray data set for yeast.
19.
The first North American RAD Sequencing and Genomics Symposium, sponsored by Floragenex (http://www.floragenex.com/radmeeting/), took place in Portland, Oregon (USA) on 19 April 2011. This symposium was convened to promote and discuss the use of restriction-site-associated DNA (RAD) sequencing technologies. RAD sequencing is one of several strategies recently developed to increase the power of data generated via short-read sequencing technologies by reducing their complexity (Baird et al. 2008; Huang et al. 2009; Andolfatto et al. 2011; Elshire et al. 2011). RAD sequencing, as a form of genotyping by sequencing, has been effectively applied in genetic mapping and quantitative trait loci (QTL) analyses in a range of organisms including nonmodel, genetically highly heterogeneous organisms (Table 1; Baird et al. 2008; Baxter et al. 2011; Chutimanitsakun et al. 2011; Pfender et al. 2011). RAD sequencing has recently found applications in phylogeography (Emerson et al. 2010) and population genomics (Hohenlohe et al. 2010). Considering the diversity of talks presented during this meeting, more developments are to be expected in the very near future.
20.
A new evolutionary model with hereditary modes considered as correlated fluctuations of fertility has been proposed. It has been demonstrated that the model allows the global statistical properties of the system to be evaluated, e.g. the ensemble average and the probability of extinction. The results obtained show the increase of instability of a population with the enhancement of inheritance efficiency. The existence of at least an exponential stratification in the population has also been shown. Possible applications of the present model are discussed.