Similar literature
20 similar articles found (search time: 0 ms)
1.
2.
3.
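Motif identification is an important problem in computational biology. Missing-data methods such as the EM algorithm and Gibbs sampling are widely used to identify motifs in biological sequences. Existing motif-finding methods first assume that the motif length is given, but in practice the length is unknown. In this paper, we use a Gibbs sampling algorithm to determine the motif length while simultaneously locating the motif positions.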
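The Gibbs sampling idea in the abstract above can be illustrated with a minimal sketch of the classic fixed-width site sampler. This is not the paper's algorithm: the abstract does not say how the motif length is sampled, so the width is passed in here, and the names `profile` and `gibbs_motif`, the pseudocount, and the uniform background are all assumptions made for illustration.

```python
import random

ALPHABET = "ACGT"


def profile(seqs, starts, width, skip, pseudo=1.0):
    """Column-wise base frequencies from every motif site except sequence `skip`."""
    counts = [{b: pseudo for b in ALPHABET} for _ in range(width)]
    for i, (s, a) in enumerate(zip(seqs, starts)):
        if i == skip:
            continue
        for j in range(width):
            counts[j][s[a + j]] += 1
    return [{b: c[b] / sum(c.values()) for b in ALPHABET} for c in counts]


def gibbs_motif(seqs, width, iters=200, seed=0):
    """Sample one motif start per sequence (hypothetical helper, fixed width)."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iters):
        for i, s in enumerate(seqs):
            # Build the profile without sequence i, then resample its start.
            prof = profile(seqs, starts, width, skip=i)
            weights = []
            for a in range(len(s) - width + 1):
                w = 1.0
                for j in range(width):
                    # Likelihood ratio against an assumed uniform background.
                    w *= prof[j][s[a + j]] / 0.25
                weights.append(w)
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts
```

One way to extend this toward an unknown length would be to run the sampler over several candidate widths and compare the fits, but that is a guess at the general approach rather than what the authors describe.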

4.
Summary: In epidemics of infectious diseases such as influenza, an individual may have one of four possible final states: prior immune, escaped from infection, infected with symptoms, and infected asymptomatically. The exact state is often not observed. In addition, the unobserved transmission times of asymptomatic infections further complicate analysis. Under the assumption of missing at random, data‐augmentation techniques can be used to integrate out such uncertainties. We adapt an importance‐sampling‐based Monte Carlo Expectation‐Maximization (MCEM) algorithm to the setting of an infectious disease transmitted in close contact groups. Assuming the independence between close contact groups, we propose a hybrid EM‐MCEM algorithm that applies the MCEM or the traditional EM algorithms to each close contact group depending on the dimension of missing data in that group, and discuss the variance estimation for this practice. In addition, we propose a bootstrap approach to assess the total Monte Carlo error and factor that error into the variance estimation. The proposed methods are evaluated using simulation studies. We use the hybrid EM‐MCEM algorithm to analyze two influenza epidemics in the late 1970s to assess the effects of age and preseason antibody levels on the transmissibility and pathogenicity of the viruses.

5.
De Novo Methylation of Repeated Sequences in Coprinus Cinereus   Cited by: 5 (self-citations: 2; citations by others: 3)
T. Freedman, P. J. Pukkila. Genetics, 1993, 135(2): 357-366
We have examined the stability of duplicated DNA sequences in the sexual phase of the life cycle of the basidiomycete fungus, Coprinus cinereus. We observed premeiotic de novo methylation in haploid nuclei containing either a triplication, a tandem duplication, or an ectopic duplication. Methylation changes were not observed in unique sequences. Repeated sequences underwent methylation changes during the dikaryotic stage. In one cross, 27% of the segregants exhibited methylation-directed gene inactivation. However, all auxotrophs eventually reverted to prototrophy. C to T transition mutations were not observed in this study. Our studies also revealed one inversion that occurred in 50% of the segregants in a single triplication cross, and a single pop-out event that occurred during vegetative growth. These alterations were similar to changes reported in experiments with duplicated sequences in Neurospora crassa and Ascobolus immersus. However, significant differences were also noted. First, the extent of methylation was much less in C. cinereus than in the other two fungi. Second, CpG sequences appeared to be the preferred targets of methylation.

6.
7.
Ibrahim JG, Chen MH, Lipsitz SR. Biometrics, 1999, 55(2): 591-596
We propose a method for estimating parameters for general parametric regression models with an arbitrary number of missing covariates. We allow any pattern of missing data and assume that the missing data mechanism is ignorable throughout. When the missing covariates are categorical, a useful technique for obtaining parameter estimates is the EM algorithm by the method of weights proposed in Ibrahim (1990, Journal of the American Statistical Association 85, 765-769). We extend this method to continuous or mixed categorical and continuous covariates, and for arbitrary parametric regression models, by adapting a Monte Carlo version of the EM algorithm as discussed by Wei and Tanner (1990, Journal of the American Statistical Association 85, 699-704). In addition, we discuss the Gibbs sampler for sampling from the conditional distribution of the missing covariates given the observed data and show that the appropriate complete conditionals are log-concave. The log-concavity property of the conditional distributions will facilitate a straightforward implementation of the Gibbs sampler via the adaptive rejection algorithm of Gilks and Wild (1992, Applied Statistics 41, 337-348). We assume the model for the response given the covariates is an arbitrary parametric regression model, such as a generalized linear model, a parametric survival model, or a nonlinear model. We model the marginal distribution of the covariates as a product of one-dimensional conditional distributions. This allows us a great deal of flexibility in modeling the distribution of the covariates and reduces the number of nuisance parameters that are introduced in the E-step. We present examples involving both simulated and real data.

8.
Testing for random mating of a population is important in population genetics, because deviations from randomness of mating may indicate inbreeding, population stratification, natural selection, or sampling bias. However, current methods use only observed numbers of genotypes and alleles, and do not take advantage of the fact that the advent of sequencing technology provides an opportunity to investigate this topic in unprecedented detail. To address this opportunity, a novel statistical test for random mating is required in population genomics studies for which large sequencing datasets are generally available. Here, we propose a Monte-Carlo-based-permutation test (MCP) as an approach to detect random mating. Computer simulations used to evaluate the performance of the permutation test indicate that its type I error is well controlled and that its statistical power is greater than that of the commonly used chi-square test (CHI). Our simulation study shows the power of our test is greater for datasets characterized by lower levels of migration between subpopulations. In addition, test power increases with increasing recombination rate, sample size, and divergence time of subpopulations. For populations exhibiting limited migration and having average levels of population divergence, the statistical power approaches 1 for sequences longer than 1Mbp and for samples of 400 individuals or more. Taken together, our results suggest that our permutation test is a valuable tool to detect random mating of populations, especially in population genomics studies.
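As a rough illustration of how a Monte Carlo permutation test for random mating can work, the sketch below shuffles the pooled alleles at a single locus and re-pairs them into genotypes, which mimics random union of gametes. The abstract does not state the test statistic used by the MCP method, so the heterozygote count used here, the single-locus setting, and the function name `permutation_test_heterozygosity` are assumptions for illustration only.

```python
import random


def permutation_test_heterozygosity(genotypes, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation test on the heterozygote count.

    genotypes: list of (allele, allele) pairs at one locus.
    Returns (observed heterozygote count, estimated p-value).
    """
    rng = random.Random(seed)
    observed = sum(a != b for a, b in genotypes)
    alleles = [allele for pair in genotypes for allele in pair]

    perm_counts = []
    for _ in range(n_perm):
        # Shuffle the allele pool, then pair neighbours into new genotypes.
        rng.shuffle(alleles)
        het = sum(alleles[i] != alleles[i + 1] for i in range(0, len(alleles), 2))
        perm_counts.append(het)

    # Centre on the permutation mean and count permutations at least as extreme.
    mean = sum(perm_counts) / n_perm
    extreme = sum(abs(h - mean) >= abs(observed - mean) for h in perm_counts)
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)
```

A sample made up entirely of homozygotes for two equally common alleles should give a very small p-value, since random pairing of the same alleles would produce many heterozygotes.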

9.
10.
Two methods of computing Monte Carlo estimators of variance components using restricted maximum likelihood via the expectation-maximisation algorithm are reviewed. A third approach is suggested and the performance of the methods is compared using simulated data.

11.
A popular way to represent clustered binary, count, or other data is via the generalized linear mixed model framework, which accommodates correlation through incorporation of random effects. A standard assumption is that the random effects follow a parametric family such as the normal distribution; however, this may be unrealistic or too restrictive to represent the data. We relax this assumption and require only that the distribution of random effects belong to a class of 'smooth' densities and approximate the density by the seminonparametric (SNP) approach of Gallant and Nychka (1987). This representation allows the density to be skewed, multi-modal, fat- or thin-tailed relative to the normal and includes the normal as a special case. Because an efficient algorithm to sample from an SNP density is available, we propose a Monte Carlo EM algorithm using a rejection sampling scheme to estimate the fixed parameters of the linear predictor, variance components and the SNP density. The approach is illustrated by application to a data set and via simulation.

12.
13.
14.
15.
16.
17.
Abstract

The Detailed Balance Energy-scaled Displacement Monte Carlo method that stems from the previously published Energy Scaled Displacement Monte Carlo method is presented. The results of tests performed on a dense Lennard-Jones liquid and on two particles in one dimension are reported.

18.
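To advance research on the tetraploid Arabidopsis (A. suecica) and to clarify how genetic material changes during chromosome doubling in polyploid plants, and thereby explain at the molecular level the mechanisms of environmental adaptation and evolution in polyploid plants, we describe a method for transcriptome short-read assembly and bioinformatics analysis based on second-generation sequencing. By assembling 23,000,000 reads from the Illumina sequencing platform with SOAPdenovo, ...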

19.
Information about relatedness between individuals in wild populations is advantageous when studying evolutionary, behavioural and ecological processes. Genomic data can be used to determine relatedness between individuals either when no prior knowledge exists or to confirm suspected relatedness. Here we present a set of 96 SNPs suitable for inferring relatedness for brown bears (Ursus arctos) within Scandinavia. We sequenced reduced representation libraries from nine individuals throughout the geographic range. With consensus reads containing putative SNPs, we applied strict filtering criteria with the aim of finding only high-quality, highly-informative SNPs. We tested 150 putative SNPs of which 96% were validated on a panel of 68 individuals. Ninety-six of the validated SNPs with the highest minor allele frequency were selected. The final SNP panel includes four mitochondrial markers, two monomorphic Y-chromosome sex-determination markers, three X-chromosome SNPs and 87 autosomal SNPs. From our validation sample panel, we identified two previously known parent-offspring dyads with reasonable accuracy. This panel of SNPs is a promising tool for inferring relatedness in the brown bear population in Scandinavia.
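The selection step described above (keeping the validated SNPs with the highest minor allele frequency) can be sketched as follows. This is only an illustration for diploid autosomal markers: the genotype coding (0, 1 or 2 copies of the alternate allele, `None` for a missing call) and the helper names `minor_allele_frequency` and `top_snps_by_maf` are assumptions, and the authors' actual pipeline and filtering criteria are not given in the abstract.

```python
def minor_allele_frequency(genotypes):
    """MAF for one diploid SNP; calls are 0, 1 or 2 alternate-allele copies, None if missing."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1 - alt_freq)


def top_snps_by_maf(calls, k):
    """Return the names of the k SNPs with the highest MAF (hypothetical helper).

    calls: dict mapping SNP name -> list of genotype calls across individuals.
    """
    ranked = sorted(
        calls,
        key=lambda name: minor_allele_frequency(calls[name]),
        reverse=True,
    )
    return ranked[:k]
```

Sex-linked and mitochondrial markers, which the panel also includes, are not diploid in every individual and would need separate handling.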

20.