Similar articles
1.
2.
Inference of the insulin secretion rate (ISR) from C-peptide measurements as a quantification of pancreatic β-cell function is clinically important in diseases related to reduced insulin sensitivity and insulin action. ISR derived from C-peptide concentration is an example of nonparametric Bayesian model selection where a proposed ISR time-course is considered to be a "model". Inferring the value of inaccessible continuous variables from discrete observable data is often problematic in biology and medicine, because it is a priori unclear how robust the inference is to the deletion of data points and, as a closely related question, how much smoothness or continuity the data actually support. Predictions weighted by the posterior distribution can be cast as functional integrals as used in statistical field theory. Functional integrals are generally difficult to evaluate, especially for nonanalytic constraints such as positivity of the estimated parameters. We propose a computationally tractable method that uses the exact solution of an associated likelihood function as a prior probability distribution for a Markov-chain Monte Carlo evaluation of the posterior for the full model. As a concrete application of our method, we calculate the ISR from actual clinical C-peptide measurements in human subjects with varying degrees of insulin sensitivity. Our method demonstrates the feasibility of functional integral Bayesian model selection as a practical method for such data-driven inference, allowing the data to determine the smoothing timescale and the width of the prior probability distribution on the space of models. In particular, our model comparison method determines the discrete time-step for interpolation of the unobservable continuous variable that is supported by the data. Attempts to go to finer discrete time-steps lead to less likely models.

3.
We investigated the ability of several principal components analysis (PCA)-based strategies to detect and control for population stratification using data from a multi-center study of epithelial ovarian cancer among women of European-American ethnicity. These include a correction based on an ancestry informative markers (AIMs) panel designed to capture European ancestral variation and corrections utilizing un-thinned genome-wide SNP data; case-control samples were drawn from four geographically distinct North-American sites. The AIMs-only and genome-wide first principal components (PC1) both corresponded to the previously described North or Northwest-Southeast axis of European variation. We found that the genome-wide PCA captured this primary dimension of variation more precisely and identified additional axes of genome-wide variation of relevance to epithelial ovarian cancer. Associations evident between the genome-wide PCs and study site corroborate North American immigration history and suggest that undiscovered dimensions of variation lie within Northern Europe. The structure captured by the genome-wide PCA was also found within control individuals and did not reflect the case-control variation present in the data. The genome-wide PCA highlighted three regions of local LD, corresponding to the lactase (LCT) gene on chromosome 2, the human leukocyte antigen system (HLA) on chromosome 6 and to a common inversion polymorphism on chromosome 8. These features did not compromise the efficacy of PCs from this analysis for ancestry control. This study concludes that although AIMs panels are a cost-effective way of capturing population structure, genome-wide data should preferably be used when available.
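
To illustrate the general idea of using a first principal component to expose population structure, here is a minimal sketch in plain Python. It is not the pipeline used in the study above: the function name `pc1_scores`, the power-iteration approach, and the toy genotype matrix are all assumptions made for illustration, and real analyses would use a dedicated genetics or linear-algebra library on far larger data.

```python
import math
import random


def pc1_scores(genotypes, iterations=200, seed=0):
    """Return each sample's score on the first principal component.

    genotypes: list of samples, each a list of allele counts (0, 1 or 2).
    Uses power iteration on the mean-centered matrix (an illustrative
    choice, not the method of any particular study).
    """
    n_samples = len(genotypes)
    n_snps = len(genotypes[0])

    # Center every SNP column on its mean allele count.
    means = [sum(row[j] for row in genotypes) / n_samples for j in range(n_snps)]
    centered = [[row[j] - means[j] for j in range(n_snps)] for row in genotypes]

    def project(vec):
        # Sample scores: centered matrix times the SNP loading vector.
        return [sum(x * v for x, v in zip(row, vec)) for row in centered]

    # Random starting vector; a fixed seed keeps the sketch reproducible.
    rng = random.Random(seed)
    loading = [rng.gauss(0.0, 1.0) for _ in range(n_snps)]

    for _ in range(iterations):
        scores = project(loading)
        # Multiply by the transpose to get the updated loading vector.
        updated = [
            sum(centered[i][j] * scores[i] for i in range(n_samples))
            for j in range(n_snps)
        ]
        norm = math.sqrt(sum(u * u for u in updated))
        if norm == 0.0:
            break  # degenerate input (no variance along this direction)
        loading = [u / norm for u in updated]

    return project(loading)


# Hypothetical toy data: two groups of three samples, five SNPs each.
toy_genotypes = [
    [2, 2, 0, 0, 1],
    [2, 1, 0, 0, 1],
    [2, 2, 0, 1, 0],
    [0, 0, 2, 2, 1],
    [0, 1, 2, 2, 1],
    [0, 0, 2, 1, 0],
]
scores = pc1_scores(toy_genotypes)
print(scores)
```

In this toy matrix the first three samples and the last three samples fall on opposite sides of zero along PC1; the sign itself is arbitrary. Including such scores as covariates in an association model is one common way to control for stratification.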

4.
5.
The genetic analysis of quantitative traits in humans is changing as a result of the availability of whole-genome SNP data. Heritability analysis can make use of actual genetic sharing between pairs of individuals estimated from the genotype data, rather than the expected genetic sharing implied by their family relationship. This could provide more accurate heritability estimates and help to overcome the equal environment assumption. Quantitative trait locus (QTL) linkage mapping can make use of local genetic sharing inferred from very dense local genotype data from pedigree members or individuals not previously known to be related. This approach may be particularly suited for detecting loci that contain rare variants with major effect on the phenotype. Finally, whole-genome SNP data can be used to measure the genetic similarity between individuals to provide matched sets for association studies, in order to avoid spurious association from population stratification.

6.
Estimation of pairwise correlation from incomplete and replicated molecular profiling data is a ubiquitous problem in pattern discovery analysis, such as clustering and networking. However, existing methods solve this problem by ad hoc data imputation, followed by averaging coefficient type approaches, which might annihilate important patterns present in the molecular profiling data. Moreover, these approaches do not consider and exploit the underlying experimental design information that specifies the replication mechanisms. We develop an Expectation-Maximization (EM) type algorithm to estimate the correlation structure using incomplete and replicated molecular profiling data with a priori known replication mechanism. The approach is sufficiently generalized to be applicable to any known replication mechanism. In case of unknown replication mechanism, it is reduced to the parsimonious model introduced previously. The efficacy of our approach was first evaluated by comprehensively comparing various bivariate and multivariate imputation approaches using simulation studies. Results from real-world data analysis further confirmed the superior performance of the proposed approach to the commonly used approaches, where we assessed the robustness of the method using data sets with up to 30 percent missing values.

7.
An extraordinarily large number of single nucleotide polymorphisms (SNPs) are now available in humans as well as in other model organisms. Technological advancements may soon make it feasible to assay hundreds of SNPs in virtually any organism of interest. One potential application of SNPs is the determination of pairwise genetic relationships in populations without known pedigrees. Although microsatellites are currently the marker of choice for this purpose, the number of independently segregating microsatellite markers that can be feasibly assayed is limited. Thus, it can be difficult to distinguish reliably some classes of relationship (e.g. full-sibs from half-sibs) with microsatellite data alone. We assess, via Monte Carlo computer simulation, the potential for using a large panel of independently segregating SNPs to infer genetic relationships, following the analytical approach of Blouin et al. (1996). We have explored a 'best case scenario' in which 100 independently segregating SNPs are available. For discrimination among single-generation relationships or for the identification of parent-offspring pairs, it appears that such a panel of moderately polymorphic SNPs (minor allele frequency of 0.20) will provide discrimination power equivalent to only 16-20 independently segregating microsatellites. Although newly available analytical methods that can account for tight genetic linkage between markers will, in theory, allow improved estimation of relationships using thousands of SNPs in highly dense genomic scans, in practice such studies will only be feasible in a handful of model organisms. Given the comparable amount of effort required for the development of both types of markers, it seems that microsatellites will remain the marker of choice for relationship estimation in nonmodel organisms, at least for the foreseeable future.

8.
Microarray expression profiles are inherently noisy and many different sources of variation exist in microarray experiments. It is still a significant challenge to develop stochastic models to realize noise in microarray expression profiles, which has profound influence on the reverse engineering of genetic regulation. Using the target genes of the tumour suppressor gene p53 as the test problem, we developed stochastic differential equation models and established the relationship between the noise strength of stochastic models and parameters of an error model for describing the distribution of the microarray measurements. Numerical results indicate that the simulated variance from stochastic models with a stochastic degradation process can be represented by a monomial in terms of the hybridization intensity and the order of the monomial depends on the type of stochastic process. The developed stochastic models with multiple stochastic processes generated simulations whose variance is consistent with the prediction of the error model. This work also established a general method to develop stochastic models from experimental information.

9.
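DNA microarray technology can simultaneously quantify the expression levels of thousands of genes in a biological sample, and the genome-wide expression data it produces make it possible to uncover the complex regulatory relationships among genes. Researchers have sought to build models of genetic interactions using mathematical and computational methods; these gene regulatory network models include clustering, Boolean networks, Bayesian networks, and differential equations. This article gives a fairly comprehensive review of the current state of research on computational methods for network reconstruction, compares the strengths and weaknesses of the different models, and discusses future research trends in the field.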

10.
11.
12.
13.
In systematics, parsimony methods construct phylogenies, or evolutionary trees, in which characters evolve with the least evolutionary change. The chromosome inversion, or polymorphism, parsimony criterion is used when each character of a population may exhibit homozygous or heterozygous states, but when the heterozygous state must evolve uniquely. Variations of the criterion concern whether or not the ancestral states of characters are specified. We establish that problems of inferring phylogenies by these criteria are NP-complete and thus are so difficult computationally that efficient optimal algorithms for them are unlikely to exist.

14.
Repetitive-element PCR (rep-PCR) is a method for genotyping bacteria based on the selective amplification of repetitive genetic elements dispersed throughout bacterial chromosomes. The method has great potential for large-scale epidemiological studies because of its speed and simplicity; however, objective guidelines for inferring relationships among bacterial isolates from rep-PCR data are lacking. We used multilocus sequence typing (MLST) as a "gold standard" to optimize the analytical parameters for inferring relationships among Escherichia coli isolates from rep-PCR data. We chose 12 isolates from a large database to represent a wide range of pairwise genetic distances, based on the initial evaluation of their rep-PCR fingerprints. We conducted MLST with these same isolates and systematically varied the analytical parameters to maximize the correspondence between the relationships inferred from rep-PCR and those inferred from MLST. Methods that compared the shapes of densitometric profiles ("curve-based" methods) yielded consistently higher correspondence values between data types than did methods that calculated indices of similarity based on shared and different bands (maximum correspondences of 84.5% and 80.3%, respectively). Curve-based methods were also markedly more robust in accommodating variations in user-specified analytical parameter values than were "band-sharing coefficient" methods, and they enhanced the reproducibility of rep-PCR. Phylogenetic analyses of rep-PCR data yielded trees with high topological correspondence to trees based on MLST and high statistical support for major clades. These results indicate that rep-PCR yields accurate information for inferring relationships among E. coli isolates and that accuracy can be enhanced with the use of analytical methods that consider the shapes of densitometric profiles.

15.
Many expression array experiments monitor gene activity as an organism goes through some biological process. It is desirable to find genes with similar expression patterns in the resulting time series data. We propose a new simulation approach that assesses the statistical significance of similarity scores between expression patterns. The simulation takes into account the dependence between columns of data.

16.
Tumor samples are typically heterogeneous, containing admixture by normal, non-cancerous cells and one or more subpopulations of cancerous cells. Whole-genome sequencing of a tumor sample yields reads from this mixture, but does not directly reveal the cell of origin for each read. We introduce THetA (Tumor Heterogeneity Analysis), an algorithm that infers the most likely collection of genomes and their proportions in a sample, for the case where copy number aberrations distinguish subpopulations. THetA successfully estimates normal admixture and recovers clonal and subclonal copy number aberrations in real and simulated sequencing data. THetA is available at http://compbio.cs.brown.edu/software/.

17.
Positive and negative associations between species are a key outcome of community assembly from regional species pools. These associations are difficult to detect and can be caused by a range of processes such as species interactions, local environmental constraints and dispersal. We integrate new ideas around species distribution modeling, covariance matrix estimation, and network analysis to provide an approach to inferring non-random species associations from local- and regional-scale occurrence data. Specifically, we provide a novel framework for identifying species associations that overcomes three challenges: 1) correcting for indirect effects from other species, 2) avoiding spurious associations driven by regional-scale distributions, and 3) describing these associations in a multi-species context. We highlight a range of research questions and analyses that this framework is able to address. We show that the approach is statistically robust using simulated data. In addition, we present an empirical analysis of > 1000 North American tree communities that gives evidence for weak positive associations among small groups of species. Finally, we discuss several possible extensions for identifying drivers of associations, predicting community assembly, and better linking biogeography and community ecology.

18.
19.
20.

Background

Human genome sequencing has enabled the association of phenotypes with genetic loci, but our ability to effectively translate this data to the clinic has not kept pace. Over the past 60 years, pharmaceutical companies have successfully demonstrated the safety and efficacy of over 1,200 novel therapeutic drugs via costly clinical studies. While this process must continue, better use can be made of the existing valuable data. In silico tools such as candidate gene prediction systems allow rapid identification of disease genes by identifying the most probable candidate genes linked to genetic markers of the disease or phenotype under investigation. Integration of drug-target data with candidate gene prediction systems can identify novel phenotypes which may benefit from current therapeutics. Such a drug repositioning tool can save valuable time and money spent on preclinical studies and phase I clinical trials.

Methods

We previously used Gentrepid (http://www.gentrepid.org) as a platform to predict 1,497 candidate genes for the seven complex diseases considered in the Wellcome Trust Case-Control Consortium genome-wide association study; namely Type 2 Diabetes, Bipolar Disorder, Crohn's Disease, Hypertension, Type 1 Diabetes, Coronary Artery Disease and Rheumatoid Arthritis. Here, we adopted a simple approach to integrate drug data from three publicly available drug databases: the Therapeutic Target Database, the Pharmacogenomics Knowledgebase and DrugBank; with candidate gene predictions from Gentrepid at the systems level.
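
The integration step described in this Methods paragraph can be pictured as a join between predicted candidate genes and pooled drug-target pairs. The sketch below is only an illustration under stated assumptions: the function `repositioning_candidates`, the gene names (`GENE_A` and so on) and the drug names (`drug_x` and so on) are hypothetical placeholders, not data or code from Gentrepid or from any of the three databases, and the sketch does not check whether a target is novel for the phenotype.

```python
def repositioning_candidates(candidate_genes, drug_targets):
    """Return {phenotype: {gene: [drugs]}} for candidate genes that are drug targets.

    candidate_genes: dict mapping a phenotype to a set of predicted genes.
    drug_targets: iterable of (drug, target_gene) pairs pooled from
        several sources; duplicates across sources are tolerated.
    """
    # Index the pooled drug-target pairs by target gene.
    drugs_by_gene = {}
    for drug, gene in drug_targets:
        drugs_by_gene.setdefault(gene, set()).add(drug)

    result = {}
    for phenotype, genes in candidate_genes.items():
        # Keep only the candidate genes that at least one drug targets.
        hits = {
            gene: sorted(drugs_by_gene[gene])
            for gene in sorted(genes)
            if gene in drugs_by_gene
        }
        if hits:
            result[phenotype] = hits
    return result


# Hypothetical toy inputs (placeholder names, not real predictions).
candidate_genes = {
    "Type 2 Diabetes": {"GENE_A", "GENE_B"},
    "Hypertension": {"GENE_C"},
}
drug_targets = [
    ("drug_x", "GENE_A"),  # pair as it might appear in one source
    ("drug_y", "GENE_A"),
    ("drug_x", "GENE_A"),  # same pair repeated in a second source
    ("drug_z", "GENE_D"),  # target that is not a candidate gene
]
candidates = repositioning_candidates(candidate_genes, drug_targets)
print(candidates)
```

Collecting drugs in a set per gene means the same drug-target pair reported by more than one database is counted once, which matters when pooling overlapping sources.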

Results

Using the publicly available drug databases as sources of drug-target association data, we identified a total of 428 candidate genes as novel therapeutic targets for the seven phenotypes of interest, and 2,130 drugs feasible for repositioning against the predicted novel targets.

Conclusions

By integrating genetic, bioinformatic and drug data, we have demonstrated that currently available drugs may be repositioned as novel therapeutics for the seven diseases studied here, quickly taking advantage of prior work in pharmaceutics to translate ground-breaking results in genetics to clinical treatments.
