1.
2.
3.
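Motif identification is an important problem in computational biology. Missing-data methods, such as the EM algorithm and Gibbs sampling, are widely used to identify motifs in biological sequences. Existing motif-finding methods first assume that the motif length is given, but in practice the length is unknown. In this paper, we use a Gibbs sampling algorithm to determine the motif length while simultaneously locating the motif positions.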
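As a rough illustration of the Gibbs sampling approach mentioned in this abstract, the following is a minimal sketch of a classic site sampler with a fixed motif width. It is not the authors' method: their algorithm also infers the motif length, whereas here `width` is supplied by the caller. The function names, the uniform background frequency of 0.25, and the pseudocount are all assumptions made for the example.

```python
import random

ALPHABET = "ACGT"


def build_profile(sequences, positions, width, skip, pseudocount=1.0):
    """Column-wise base frequencies from every motif instance except sequence `skip`."""
    counts = [{base: pseudocount for base in ALPHABET} for _ in range(width)]
    for i, (seq, pos) in enumerate(zip(sequences, positions)):
        if i == skip:
            continue
        for j in range(width):
            counts[j][seq[pos + j]] += 1
    profile = []
    for column in counts:
        total = sum(column.values())
        profile.append({base: c / total for base, c in column.items()})
    return profile


def gibbs_motif_sampler(sequences, width, n_iterations=200, seed=0):
    """Sample one motif start position per sequence (fixed width, assumed known)."""
    rng = random.Random(seed)
    # Start from random positions in every sequence.
    positions = [rng.randrange(len(s) - width + 1) for s in sequences]
    for _ in range(n_iterations):
        for i, seq in enumerate(sequences):
            # Profile built from all other sequences (the held-out step).
            profile = build_profile(sequences, positions, width, skip=i)
            weights = []
            for start in range(len(seq) - width + 1):
                # Likelihood ratio of motif model vs. uniform background (assumed 0.25).
                w = 1.0
                for j in range(width):
                    w *= profile[j][seq[start + j]] / 0.25
                weights.append(w)
            # Draw the new start position in proportion to the weights.
            positions[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return positions
```

Calling `gibbs_motif_sampler(seqs, width=6)` returns one sampled start index per sequence. Extending the sketch to unknown motif length, as the abstract describes, would require an additional sampling step over `width` that is not shown here.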
4.
We have examined the stability of duplicated DNA sequences in the sexual phase of the life cycle of the basidiomycete fungus, Coprinus cinereus. We observed premeiotic de novo methylation in haploid nuclei containing either a triplication, a tandem duplication, or an ectopic duplication. Methylation changes were not observed in unique sequences. Repeated sequences underwent methylation changes during the dikaryotic stage. In one cross, 27% of the segregants exhibited methylation-directed gene inactivation. However, all auxotrophs eventually reverted to prototrophy. C to T transition mutations were not observed in this study. Our studies also revealed one inversion that occurred in 50% of the segregants in a single triplication cross, and a single pop-out event that occurred during vegetative growth. These alterations were similar to changes reported in experiments with duplicated sequences in Neurospora crassa and Ascobolus immersus. However, significant differences were also noted. First, the extent of methylation was much less in C. cinereus than in the other two fungi. Second, CpG sequences appeared to be the preferred targets of methylation.
5.
6.
Testing for random mating of a population is important in population genetics, because deviations from randomness of mating may indicate inbreeding, population stratification, natural selection, or sampling bias. However, current methods use only observed numbers of genotypes and alleles, and do not take advantage of the fact that the advent of sequencing technology provides an opportunity to investigate this topic in unprecedented detail. To address this opportunity, a novel statistical test for random mating is required in population genomics studies for which large sequencing datasets are generally available. Here, we propose a Monte-Carlo-based permutation test (MCP) as an approach to detect random mating. Computer simulations used to evaluate the performance of the permutation test indicate that its type I error is well controlled and that its statistical power is greater than that of the commonly used chi-square test (CHI). Our simulation study shows that the power of our test is greater for datasets characterized by lower levels of migration between subpopulations. In addition, test power increases with increasing recombination rate, sample size, and divergence time of subpopulations. For populations exhibiting limited migration and having average levels of population divergence, the statistical power approaches 1 for sequences longer than 1 Mbp and for samples of 400 individuals or more. Taken together, our results suggest that our permutation test is a valuable tool to detect random mating of populations, especially in population genomics studies.
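To make the idea of a Monte Carlo permutation test concrete, here is a minimal single-locus sketch. It is not the MCP test proposed in this abstract, which works on sequence data and whose statistic is not given here. The heterozygote-count statistic, the function names, and the default of 2000 permutations are assumptions chosen for illustration: alleles are shuffled across individuals to mimic random mating, and the observed heterozygote count is compared with the shuffled counts.

```python
import random


def count_heterozygotes(genotypes):
    """Number of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b)


def permutation_test_random_mating(genotypes, n_permutations=2000, seed=0):
    """Two-sided Monte Carlo p-value for departure from random mating at one locus.

    `genotypes` is a list of (allele, allele) pairs, one pair per individual.
    """
    rng = random.Random(seed)
    observed = count_heterozygotes(genotypes)
    # Pool all alleles; under random mating they pair up at random.
    alleles = [allele for pair in genotypes for allele in pair]
    null_stats = []
    for _ in range(n_permutations):
        rng.shuffle(alleles)
        shuffled = list(zip(alleles[0::2], alleles[1::2]))
        null_stats.append(count_heterozygotes(shuffled))
    null_mean = sum(null_stats) / len(null_stats)
    observed_dev = abs(observed - null_mean)
    # Count permutations at least as far from the null mean as the observed value.
    extreme = sum(1 for s in null_stats if abs(s - null_mean) >= observed_dev)
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)
```

A sample with no heterozygotes at all (for example, 50 `("A", "A")` and 50 `("a", "a")` individuals) yields a very small p-value, while a sample in Hardy-Weinberg proportions does not.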
7.
Two methods of computing Monte Carlo estimators of variance components using restricted maximum likelihood via the expectation-maximisation algorithm are reviewed. A third approach is suggested and the performance of the methods is compared using simulated data.
8.
A popular way to represent clustered binary, count, or other data is via the generalized linear mixed model framework, which accommodates correlation through incorporation of random effects. A standard assumption is that the random effects follow a parametric family such as the normal distribution; however, this may be unrealistic or too restrictive to represent the data. We relax this assumption and require only that the distribution of random effects belong to a class of 'smooth' densities and approximate the density by the seminonparametric (SNP) approach of Gallant and Nychka (1987). This representation allows the density to be skewed, multi-modal, fat- or thin-tailed relative to the normal and includes the normal as a special case. Because an efficient algorithm to sample from an SNP density is available, we propose a Monte Carlo EM algorithm using a rejection sampling scheme to estimate the fixed parameters of the linear predictor, variance components and the SNP density. The approach is illustrated by application to a data set and via simulation.
9.
10.
11.
12.
The Detailed Balance Energy-scaled Displacement Monte Carlo method that stems from the previously published Energy Scaled Displacement Monte Carlo method is presented. The results of tests performed on a dense Lennard-Jones liquid and on two particles in one dimension are reported.
13.
14.
Elizabeth E. Palmer Seungbeom Hong Fatema Al Zahrani Mais O. Hashem Fajr A. Aleisa Heba M. Jalal Ahmed Tejaswi Kandula Rebecca Macintosh Andre E. Minoche Clare Puttick Velimir Gayevskiy Alexander P. Drew Mark J. Cowley Marcel Dinger Jill A. Rosenfeld Rui Xiao Megan T. Cho Suliat F. Yakubu Stefan T. Arold 《American journal of human genetics》2019,104(3):542-552
15.
A modified grand canonical ensemble Monte Carlo (GCMC) technique has been developed to simulate adsorption isotherms for molecules on or near a surface. The speed and accuracy of the simulation are increased by using a non-uniform distribution function, related to the force field exerted by the surface and the current configuration, to generate coordinates for the creation of new particles in the simulation. With this method, isotherms are generated more efficiently than with current techniques in which the creation step relies on a uniform distribution to generate the coordinates of a new molecule. This is shown by comparing the calculation of an isotherm for a simple molecule adsorbed on a graphite substrate from a traditional GCMC simulation with that calculated using this new technique.
16.
Bingshan Li Wei Chen Xiaowei Zhan Fabio Busonero Serena Sanna Carlo Sidore Francesco Cucca Hyun M. Kang Gonçalo R. Abecasis 《PLoS genetics》2012,8(10)
Family samples, which can be enriched for rare causal variants by focusing on families with multiple extreme individuals and which facilitate detection of de novo mutation events, provide an attractive resource for next-generation sequencing studies. Here, we describe, implement, and evaluate a likelihood-based framework for analysis of next generation sequence data in family samples. Our framework is able to identify variant sites accurately and to assign individual genotypes, and can handle de novo mutation events, increasing the sensitivity and specificity of variant calling and de novo mutation detection. Through simulations we show explicit modeling of family relationships is especially useful for analyses of low-frequency variants and that genotype accuracy increases with the number of individuals sequenced per family. Compared with the standard approach of ignoring relatedness, our methods identify and accurately genotype more variants, and have high specificity for detecting de novo mutation events. The improvement in accuracy using our methods over the standard approach is particularly pronounced for low-frequency variants. Furthermore the family-aware calling framework dramatically reduces Mendelian inconsistencies and is beneficial for family-based analysis. We hope our framework and software will facilitate continuing efforts to identify genetic factors underlying human diseases.
17.
18.
19.
A Monte Carlo method for Bayesian inference in frailty models (cited by 3; self-citations 0; citations by others 3)
D G Clayton 《Biometrics》1991,47(2):467-485
Many analyses in epidemiological and prognostic studies and in studies of event history data require methods that allow for unobserved covariates or "frailties." Clayton and Cuzick (1985, Journal of the Royal Statistical Society, Series A 148, 82-117) proposed a generalization of the proportional hazards model that implemented such random effects, but the proof of the asymptotic properties of the method remains elusive, and practical experience suggests that the likelihoods may be markedly nonquadratic. This paper sets out a Bayesian representation of the model in the spirit of Kalbfleisch (1978, Journal of the Royal Statistical Society, Series B 40, 214-221) and discusses inference using Monte Carlo methods.
20.
SLAF-seq: An Efficient Method of Large-Scale De Novo SNP Discovery and Genotyping Using High-Throughput Sequencing (cited by 2; self-citations 0; citations by others 2)
Xiaowen Sun Dongyuan Liu Xiaofeng Zhang Wenbin Li Hui Liu Weiguo Hong Chuanbei Jiang Ning Guan Chouxian Ma Huaping Zeng Chunhua Xu Jun Song Long Huang Chunmei Wang Junjie Shi Rui Wang Xianhu Zheng Cuiyun Lu Xiaowu Wang Hongkun Zheng 《PloS one》2013,8(3)
Large-scale genotyping plays an important role in genetic association studies. It has provided new opportunities for gene discovery, especially when combined with high-throughput sequencing technologies. Here, we report an efficient solution for large-scale genotyping. We call it specific-locus amplified fragment sequencing (SLAF-seq). SLAF-seq technology has several distinguishing characteristics: i) deep sequencing to ensure genotyping accuracy; ii) reduced representation strategy to reduce sequencing costs; iii) pre-designed reduced representation scheme to optimize marker efficiency; and iv) double barcode system for large populations. In this study, we tested the efficiency of SLAF-seq on rice and soybean data. Both sets of results showed strong consistency between predicted and practical SLAFs and considerable genotyping accuracy. We also report the highest density genetic map yet created for any organism without a reference genome sequence, common carp in this case, using SLAF-seq data. We detected 50,530 high-quality SLAFs with 13,291 SNPs genotyped in 211 individual carp. The genetic map contained 5,885 markers with 0.68 cM intervals on average. A comparative genomics study between common carp genetic map and zebrafish genome sequence map showed high-quality SLAF-seq genotyping results. SLAF-seq provides a high-resolution strategy for large-scale genotyping and can be generally applicable to various species and populations.