1.

Identifying chromosomal fragile sites from individuals: a multinomial statistical model

Udo Böhm P. Frederick Dahm Bryant F. McAllister Ira F. Greenbaum 《Human genetics》1995,95(3):249-256

The inability to identify fragile sites from data for single individuals remains the major obstacle to determining whether these chromosomal loci are predisposed to cancer-causing and evolutionary rearrangements. We describe a novel statistical model that is amenable to data from single individuals and that establishes site-specific chromosomal breakage as nonrandom with respect to the distribution of total breakage. Our method tests incrementally smaller subsets of the data for homogeneity under a multinomial model that assigns equal probabilities to a maximal set of nonfragile sites and unrestricted probabilities to the remaining fragile sites with significantly higher numbers of breaks. We show how standardized Pearson's chi-square (X²) and likelihood-ratio (G²) statistics can be appropriately used to measure goodness-of-fit for sparse contingency (individual-based) data in this model. A sample application of this approach indicates extensive variation in fragile sites among individuals and marked differences in fragile-site inferences from pooled as opposed to per-individual data.
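The idea of testing incrementally smaller subsets for homogeneity can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the cutoff of 1.645, and the standardization (centering and scaling by the asymptotic chi-square mean k-1 and variance 2(k-1)) are assumptions made for illustration, and the paper's own standardization for sparse data may differ.

```python
import math

def goodness_of_fit(counts):
    """Pearson X^2 and likelihood-ratio G^2 against equal cell probabilities."""
    n = sum(counts)
    k = len(counts)
    expected = n / k  # every site is equally likely under the null
    x2 = sum((c - expected) ** 2 / expected for c in counts)
    # Sites with zero breaks contribute nothing to G^2.
    g2 = 2.0 * sum(c * math.log(c / expected) for c in counts if c > 0)
    return x2, g2

def standardize(stat, k):
    """Assumed standardization: chi-square mean (k-1) and variance 2(k-1)."""
    df = k - 1
    return (stat - df) / math.sqrt(2 * df)

def split_sites(counts, threshold=1.645):
    """Peel off the highest-count site until the rest look homogeneous."""
    remaining = sorted(counts)
    fragile = []
    while len(remaining) > 2:
        x2, _ = goodness_of_fit(remaining)
        if standardize(x2, len(remaining)) <= threshold:
            break  # the remaining sites fit the equal-probability model
        fragile.append(remaining.pop())  # largest count is set aside
    return remaining, fragile

# Hypothetical break counts for nine sites in one individual.
nonfragile, fragile = split_sites([1, 2, 1, 0, 2, 1, 1, 2, 30])
print(nonfragile, fragile)
```

With these made-up counts, only the site with 30 breaks is set aside; the remaining eight sites are consistent with equal breakage probabilities.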

2.

Identifying biological concepts from a protein-related corpus with a probabilistic topic model (total citations: 1; self-citations: 0; citations by others: 1)

Bin Zheng David C McLean Xinghua Lu 《BMC bioinformatics》2006,7(1):58

Background

Biomedical literature, e.g., MEDLINE, contains a wealth of knowledge regarding functions of proteins. Major recurring biological concepts within such text corpora represent the domains of this body of knowledge. The goal of this research is to identify the major biological topics/concepts from a corpus of protein-related MEDLINE® titles and abstracts by applying a probabilistic topic model.

3.

Inherited disorder phenotypes: controlled annotation and statistical analysis for knowledge mining from gene lists

Masseroli M Galati O Manzotti M Gibert K Pinciroli F 《BMC bioinformatics》2005,6(Z4):S18

4.

Correlating overrepresented upstream motifs to gene expression: a computational approach to regulatory element discovery in eukaryotes

Michele Caselle Ferdinando Di Cunto Paolo Provero 《BMC bioinformatics》2002,3(1):7-10

5.

Incorporating gene networks into statistical tests for genomic data via a spatially correlated mixture model (total citations: 1; self-citations: 0; citations by others: 1)

Wei P Pan W 《Bioinformatics (Oxford, England)》2008,24(3):404-411

6.

Regmex: a statistical tool for exploring motifs in ranked sequence lists from genomics experiments

Morten Muhlig Nielsen Paula Tataru Tobias Madsen Asger Hobolth Jakob Skou Pedersen 《Algorithms for molecular biology : AMB》2018,13(1):17

Background

Motif analysis methods have long been central for studying biological function of nucleotide sequences. Functional genomics experiments extend their potential. They typically generate sequence lists ranked by an experimentally acquired functional property such as gene expression or protein binding affinity. Current motif discovery tools suffer from limitations in searching large motif spaces, and thus more complex motifs may not be included. There is thus a need for motif analysis methods that are tailored for analyzing specific complex motifs motivated by biological questions and hypotheses rather than acting as a screen based motif finding tool.

Methods

We present Regmex (REGular expression Motif EXplorer), which offers several methods to identify overrepresented motifs in ranked lists of sequences. Regmex uses regular expressions to define motifs or families of motifs and embedded Markov models to calculate exact p-values for motif observations in sequences. Biases in motif distributions across ranked sequence lists are evaluated using random walks, Brownian bridges, or modified rank based statistics. A modular setup and fast analytic p value evaluations make Regmex applicable to diverse and potentially large-scale motif analysis problems.
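A minimal Python sketch of the general idea (a regular-expression motif scored along a ranked sequence list) is given here. It is a deliberate simplification and not the Regmex implementation: it records only presence or absence of the motif rather than exact p-values from embedded Markov models, and the function names and toy sequences are hypothetical.

```python
import re

def motif_hits(sequences, pattern):
    """Return 1 or 0 per sequence: does the regular expression match anywhere?"""
    motif = re.compile(pattern)
    return [1 if motif.search(seq) else 0 for seq in sequences]

def max_bridge_deviation(hits):
    """Largest absolute gap between the cumulative hit count and the
    straight line expected if hits were spread evenly over the ranking."""
    n, total = len(hits), sum(hits)
    running, worst = 0, 0.0
    for i, h in enumerate(hits, start=1):
        running += h
        worst = max(worst, abs(running - total * i / n))
    return worst

# Toy sequences, ordered from highest to lowest rank (made up for illustration).
ranked = ["ACGTGCCTTAA", "TTTTGCACTT", "GGGTGCCTTC",
          "ACACACAC", "GGGGCCCC", "ATATATAT"]
hits = motif_hits(ranked, r"TGCCTT")
print(hits, max_bridge_deviation(hits))
```

A large deviation suggests the motif is concentrated at one end of the ranking. Turning that deviation into a p-value requires a null distribution (for example from permutations), which this sketch leaves out.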

Results

We demonstrate use cases of combined motifs on simulated data and on expression data from microRNA transfection experiments. We confirm previously obtained results and demonstrate the usability of Regmex to test a specific hypothesis about the relative location of microRNA seed sites and U-rich motifs. We further compare the tool with an existing motif discovery tool and show increased sensitivity.

Conclusions

Regmex is a useful and flexible tool to analyze motif hypotheses that relate to large data sets in functional genomics. The method is available as an R package (https://github.com/muhligs/regmex).

7.

Correction: A linear classifier based on entity recognition tools and a statistical approach to method extraction in the protein-protein interaction literature

Anália Lourenço Michael Conover Andrew Wong Azadeh Nematzadeh Fengxia Pan Hagit Shatkay Luis M Rocha 《BMC bioinformatics》2012,13(1):1-3

Correction to A. Lourenço, M. Conover, A. Wong, A. Nematzadeh, F. Pan, H. Shatkay, and L.M. Rocha. "A Linear Classifier Based on Entity Recognition Tools and a Statistical Approach to Method Extraction in the Protein-Protein Interaction Literature". BMC Bioinformatics 2011, 12(Suppl 8):S12. doi:10.1186/1471-2105-12-S8-S12.

8.

GenCLiP: a software program for clustering gene lists by literature profiling and constructing gene co-occurrence networks related to custom keywords

Zhong-Xi Huang Hui-Yong Tian Zhen-Fu Hu Yi-Bo Zhou Jin Zhao Kai-Tai Yao 《BMC bioinformatics》2008,9(1):308

Background

Biomedical researchers often want to explore pathogenesis and pathways regulated by abnormally expressed genes, such as those identified by microarray analyses. Literature mining is an important way to assist in this task. Many literature mining tools are now available. However, few of them allow the user to make manual adjustments to zero in on what he/she wants to know in particular.

9.

Selective isolation of mycobacteria from soil: a statistical analysis approach (total citations: 10; self-citations: 0; citations by others: 10)

F Portaels A De Muynck M P Sylla 《Journal of general microbiology》1988,134(3):849-855

We compared four decontamination methods for the isolation of mycobacteria from soil specimens. Different media were used: Löwenstein-Jensen, Ogawa and various modified Ogawa media. Statistical analysis demonstrated that the best results (low contamination and high positivity rates) were obtained when the specimens were incubated in trypticase soy broth, treated with solutions containing malachite green and cycloheximide, then decontaminated with sodium hydroxide and inoculated onto Ogawa media. The lowest contamination rates were obtained with Ogawa medium containing 500 µg cycloheximide ml⁻¹. The use of these techniques is proposed for the isolation of mycobacteria from heavily contaminated clinical specimens as well as from soil.

10.

Identifying differential expression in multiple SAGE libraries: an overdispersed log-linear model approach

Jun Lu John K Tomfohr Thomas B Kepler 《BMC bioinformatics》2005,6(1):165

Background

In testing for differential gene expression involving multiple serial analysis of gene expression (SAGE) libraries, it is critical to account for both between-library and within-library variation. Several methods have been proposed, including the t test, the t_w test, and an overdispersed logistic regression approach. The merits of these tests, however, have not been fully evaluated. Questions still remain on whether further improvements can be made.

11.

Ranking candidate disease genes from gene expression and protein interaction: a Katz-centrality based approach

Zhao J Yang TH Huang Y Holme P 《PloS one》2011,6(9):e24306

Many diseases have complex genetic causes, where a set of alleles can affect the propensity of getting the disease. The identification of such disease genes is important to understand the mechanistic and evolutionary aspects of pathogenesis, improve diagnosis and treatment of the disease, and aid in drug discovery. Current genetic studies typically identify chromosomal regions associated with specific diseases. But picking out an unknown disease gene from hundreds of candidates located on the same genomic interval is still challenging. In this study, we propose an approach to prioritize candidate genes by integrating data of gene expression level, protein-protein interaction strength and known disease genes. Our method is based on only two simple, biologically motivated assumptions: that a gene is a good disease-gene candidate if it is differentially expressed in cases and controls, or that it is close to other disease-gene candidates in its protein interaction network. We tested our method on 40 diseases in 58 gene expression datasets of the NCBI Gene Expression Omnibus database. On these datasets our method is able to predict unknown disease genes as well as to identify pleiotropic genes involved in the physiological cellular processes of many diseases. Our study not only provides an effective algorithm for prioritizing candidate disease genes but is also a way to discover phenotypic interdependency, co-occurrence and shared pathophysiology between different disorders.
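The two assumptions can be combined in a Katz-style score, sketched below in Python. This is a generic Katz centrality iteration, not the authors' exact formulation: using a per-gene differential-expression value as the base term, the attenuation factor `alpha`, and the toy network are all assumptions made for illustration.

```python
def katz_scores(adjacency, base, alpha=0.1, iterations=200):
    """Iterate x = base + alpha * A x, a Katz-style centrality.

    adjacency: square list of lists of interaction weights (assumed symmetric).
    base: per-gene starting score, e.g. a differential-expression measure.
    alpha must be below 1 / (largest eigenvalue of A) for convergence.
    """
    n = len(base)
    scores = list(base)
    for _ in range(iterations):
        scores = [
            base[i] + alpha * sum(adjacency[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
    return scores

# Toy chain of three genes: g0 - g1 - g2; only g0 is differentially expressed.
network = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
print(katz_scores(network, base=[1.0, 0.0, 0.0]))
```

In this toy network, g1 scores higher than g2 because it sits closer to the differentially expressed gene, which is the kind of network proximity the second assumption describes.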

12.

A neural network model of metaphor understanding with dynamic interaction based on a statistical language analysis: targeting a human-like model

Terai A Nakagawa M 《International journal of neural systems》2007,17(4):265-274

The purpose of this paper is to construct a model that represents the human process of understanding metaphors, focusing specifically on similes of the form "an A like B". Generally speaking, human beings are able to generate and understand many sorts of metaphors. This study constructs the model based on a probabilistic knowledge structure for concepts which is computed from a statistical analysis of a large-scale corpus. Consequently, this model is able to cover the many kinds of metaphors that human beings can generate. Moreover, the model implements the dynamic process of metaphor understanding by using a neural network with dynamic interactions. Finally, the validity of the model is confirmed by comparing model simulations with the results from a psychological experiment.

13.

Motif Yggdrasil: sampling sequence motifs from a tree mixture model.

Samuel A Andersson Jens Lagergren 《Journal of computational biology》2007,14(5):682-697

In phylogenetic foot-printing, putative regulatory elements are found in upstream regions of orthologous genes by searching for common motifs. Motifs in different upstream sequences are subject to mutations along the edges of the corresponding phylogenetic tree; consequently, taking advantage of the tree in the motif search is an appealing idea. We describe the Motif Yggdrasil sampler, the first Gibbs sampler based on a general tree that uses unaligned sequences. Previous tree-based Gibbs samplers have assumed a star-shaped tree or partially aligned upstream regions. We give a probabilistic model (MY model) describing upstream sequences with regulatory elements and build a Gibbs sampler with respect to this model. The model allows toggling, i.e., the restriction of a position to a subset of nucleotides, but requires neither aligned sequences nor edge lengths, which may be difficult to come by. We apply the collapsing technique to eliminate the need to sample nuisance parameters, and give a derivation of the predictive update formula. We show that the MY model improves the modeling of difficult motif instances and that the use of the tree achieves a substantial increase in nucleotide-level correlation coefficient both for synthetic data and for 37 bacterial lexA genes. We investigate the sensitivity to errors in the tree and show that, using random trees, the MY sampler still has a performance similar to the original version.

14.

GENECODIS: a web-based tool for finding significant concurrent annotations in gene lists

Carmona-Saez P Chagoyen M Tirado F Carazo JM Pascual-Montano A 《Genome biology》2007,8(1):R3

We present GENECODIS, a web-based tool that integrates different sources of information to search for annotations that frequently co-occur in a set of genes and rank them by statistical significance. The analysis of concurrent annotations provides significant information for the biologic interpretation of high-throughput experiments and may outperform the results of standard methods for the functional analysis of gene lists. GENECODIS is publicly available at .
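To make "ranking co-occurring annotations by statistical significance" concrete, here is a minimal Python sketch. The abstract does not name the test used, so the hypergeometric tail probability below is an assumption (a common choice for enrichment analysis), and the function names and toy annotations are hypothetical.

```python
from math import comb

def cooccurrence_count(gene_annotations, combo):
    """Number of genes carrying every annotation in `combo`."""
    return sum(1 for terms in gene_annotations.values() if combo <= terms)

def hypergeom_tail(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the annotation set."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return favourable / total

# Toy input list: three genes with made-up annotation terms.
gene_list = {
    "g1": {"apoptosis", "kinase"},
    "g2": {"apoptosis"},
    "g3": {"apoptosis", "kinase", "membrane"},
}
hits = cooccurrence_count(gene_list, {"apoptosis", "kinase"})
# Assumed background: 1000 genes, 20 of which carry both terms.
print(hits, hypergeom_tail(1000, 20, len(gene_list), hits))
```

Each candidate combination of annotations would receive such a p-value, and combinations would then be sorted from smallest to largest, ideally after a multiple-testing correction.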

15.

Modeling pH effects on microbial growth: a statistical thermodynamic approach (total citations: 2; self-citations: 0; citations by others: 2)

Tan Y Wang ZX Marshall KC 《Biotechnology and bioengineering》1998,59(6):724-731

This paper applies a statistical thermodynamic approach to the kinetics of microbial growth influenced by pH. A general equation is developed and shown to provide a good theoretical basis for the existing pH models that have been widely used to describe the effects of pH on microbial growth kinetics. Four experimental data sets are used to test the general equation developed. The four data sets exhibited a variety of functional curve shapes, for example, symmetrical and asymmetrical bell-shaped, when the specific growth rate of microorganisms is plotted as a function of pH. All four data sets are found to be well represented by the general equation. The existing pH model was, however, found to represent only one out of four data sets, i.e., the symmetrical case.

16.

Habitat size and number in multi-habitat landscapes: a model approach based on species-area curves (total citations: 1; self-citations: 0; citations by others: 1)

Even Tjørve 《Ecography》2002,25(1):17-24

This paper discusses species diversity in simple multi-habitat environments. Its main purpose is to present simple mathematical and graphical models of how landscape patterns affect species numbers. The idea is to build models of species diversity in multi-habitat landscapes by combining species-area curves for different habitats. Predictions are made about how variables such as species richness and species overlap between habitats influence the proportion of the total landscape each habitat should constitute, and how many habitats it should be divided into in order to be able to sustain the maximal number of species. Habitat size and numbers are the only factors discussed here, not habitat spatial patterns. Among the predictions are: 1) Where there are differences in species diversity between habitats, optimal landscape patterns contain larger proportions of species-rich habitats. 2) Species overlap between habitats shifts the optimum further towards larger proportions of species-rich habitat types. 3) Species overlap also shifts the optimum towards fewer habitat types. 4) Species diversity in landscapes with large species overlap is more resistant to changes in landscape (or reserve) size. This type of model approach can produce theories useful to nature and landscape management in general, and the design of nature reserves and national parks in particular.
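The first prediction can be illustrated with a small Python sketch that combines two species-area curves. The abstract does not give the curve form, so the power law S = c·A^z is an assumption, as are the parameter values, the absence of species overlap, and the grid search used to locate the optimum.

```python
def species(area, c, z=0.25):
    """Assumed power-law species-area curve: S = c * A**z."""
    return c * area ** z

def total_species(p, c_rich, c_poor, landscape=100.0):
    """Species in a two-habitat landscape with no species overlap;
    p is the proportion of the landscape given to the richer habitat."""
    return (species(p * landscape, c_rich)
            + species((1 - p) * landscape, c_poor))

def best_proportion(c_rich, c_poor, steps=100):
    """Grid search over p = 1/steps ... (steps-1)/steps."""
    candidates = [i / steps for i in range(1, steps)]
    return max(candidates, key=lambda p: total_species(p, c_rich, c_poor))

# Hypothetical habitats: one supports twice as many species per unit area.
print(best_proportion(c_rich=20.0, c_poor=10.0))
```

Under these assumptions the optimum allocates well over half the landscape to the species-rich habitat, in line with prediction 1. Modeling species overlap (predictions 2 to 4) would require subtracting shared species, which this sketch does not attempt.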

17.

Identifying natural grouping structure in gelada baboons: a network approach

Pádraig Mac Carron R.I.M. Dunbar 《Animal behaviour》2016

18.

Identifying heterogeneous anisotropic properties in cerebral aneurysms: a pointwise approach

Xuefeng Zhao Madhavan L. Raghavan Jia Lu 《Biomechanics and modeling in mechanobiology》2011,10(2):177-189

The traditional approaches of estimating heterogeneous properties in a soft tissue structure using optimization-based inverse methods often face difficulties because of the large number of unknowns to be simultaneously determined. This article proposes a new method for identifying the heterogeneous anisotropic nonlinear elastic properties in cerebral aneurysms. In this method, the local properties are determined directly from the pointwise stress–strain data, thus avoiding the need for simultaneously optimizing for the property values at all points/regions in the aneurysm. The stress distributions needed for a pointwise identification are computed using an inverse elastostatic method without invoking the material properties in question. This paradigm is tested numerically through simulated inflation tests on an image-based cerebral aneurysm sac. The wall tissue is modeled as an eight-ply laminate whose constitutive behavior is described by an anisotropic hyperelastic strain energy function containing four parameters. The parameters are assumed to vary continuously in the sac. Deformed configurations generated from forward finite element analysis are taken as input to inversely establish the parameter distributions. The delineated and the assigned distributions are in excellent agreement. A forward verification is conducted by comparing the displacement solutions obtained from the delineated and the assigned material parameters at a different pressure. The deviations in nodal displacements are found to be within 0.2% in most parts of the sac. The study highlights some distinct features of the proposed method, and demonstrates the feasibility of organ-level identification of the distributive anisotropic nonlinear properties in cerebral aneurysms.

19.

Radial arrangement of chromosome territories in human cell nuclei: a computer model approach based on gene density indicates a probabilistic global positioning code

Kreth G Finsterle J von Hase J Cremer M Cremer C 《Biophysical journal》2004,86(5):2803-2812

Numerous investigations in recent years have focused on chromosome arrangements in interphase nuclei. Recent experiments concerning the radial positioning of chromosomes in the nuclear volume of human and primate lymphocyte cells suggest a relationship between the gene density of a chromosome territory (CT) and its distance to the nuclear center. To relate chromosome positioning and gene density in a quantitative way, computer simulations of whole human cell nuclear genomes of normal karyotype were performed on the basis of the spherical 1 Mbp chromatin domain model and the latest data about sequence length and gene density of chromosomes. Three different basic assumptions about the initial distribution of chromosomes were used: a statistical, a deterministic, and a probabilistic initial distribution. After a simulated decondensation in early G1, a comparison of the radial distributions of simulated and experimentally obtained data for CTs Nos. 12, 18, 19, and 20 was made. It was shown that the experimentally observed distributions can be fitted better assuming an initial probabilistic distribution. This supports the concept of a probabilistic global gene positioning code depending on CT sequence length and gene density.

20.

Explicating hypergonadotropism in postmenopausal women: a statistical model

Keenan DM Veldhuis JD 《American journal of physiology. Regulatory, integrative and comparative physiology》2000,278(5):R1247-R1257

Neurohormone secretion is viewed here as a variable (unknown) admixture of basal and pulsatile release mechanisms, convolved with individually fitted biexponential elimination kinetics. This construct allows maximum-likelihood estimates of both (regulated and constitutive) components of hormone secretion. Thereby, we infer that a prolonged slow-component half-life of gonadotropin removal and amplified pulsatile (and total) daily luteinizing hormone (LH) secretion rates jointly explicate the postmenopausal elevation in serum LH concentrations without a necessary rise in basal LH secretion rates. This biomathematical formulation should be useful in exploring other neuroregulatory mechanisms that underlie single or dual alterations in the basal versus pulsatile modes of hormone secretion.