Similar literature
 Found 20 similar documents (search time: 74 ms)
1.
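Multiple structural alignments of 110 families of homologous proteins were analyzed statistically to derive a table of local-environment-dependent amino acid similarity scores. With this table, fast, high-quality rigid superposition of protein structures can be carried out for protein engineering. Compared with the best genetic algorithm currently available, the method achieves the same quality (as many topologically equivalent residue pairs as possible and as small a root-mean-square distance as possible) while reducing CPU time by one to two orders of magnitude, with the gain more pronounced for large proteins.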

2.
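To address missing data in gene microarrays, a method is proposed that exploits the intrinsic link between protein-protein interactions and gene expression, using protein-protein interaction (PPI) information to improve the accuracy of missing-value estimation in microarray data. PPI relationships are combined with distances between gene expression profiles to compute the expression similarity between genes, and this new similarity measure is used to select a more suitable set of genes for estimating the missing values of a gene with missing data. Combining the new measure with the traditional KNNimpute and LLSimpute methods yields the corresponding improved algorithms, PPI-KNNimpute and PPI-LLSimpute. Tests on real data sets show that PPI information effectively improves the accuracy of missing-value estimation.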
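The abstract does not state how interaction information and expression distance are combined, so the following is only a minimal sketch of the general idea under an assumed rule: the expression distance between two genes is multiplied by a hypothetical factor `alpha` when their proteins are known to interact, and the missing value is then estimated as a distance-weighted average over the `k` nearest genes. The function name, the `alpha` rule, and the data layout are all illustrative assumptions, not the PPI-KNNimpute algorithm itself.

```python
import math

def knn_impute(expr, target, column, interactions, k=2, alpha=0.5):
    """Estimate expr[target][column] from the k most similar genes.

    expr: {gene: [value or None, ...]}, one list per gene.
    interactions: set of frozenset({gene_a, gene_b}) pairs (assumed input format).
    alpha: hypothetical shrink factor applied to the distance of interacting pairs.
    """
    candidates = []
    for gene, row in expr.items():
        if gene == target or row[column] is None:
            continue
        # Compare only the columns observed in both genes.
        shared = [(a, b) for a, b in zip(expr[target], row)
                  if a is not None and b is not None]
        if not shared:
            continue
        dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
        if frozenset((target, gene)) in interactions:
            dist *= alpha  # assumed rule: interacting genes count as closer
        candidates.append((dist, row[column]))
    if not candidates:
        raise ValueError("no gene with an observed value in this column")
    nearest = sorted(candidates)[:k]
    weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)

expr = {
    "g1": [1.0, 2.0, None],
    "g3": [4.0, 6.0, 9.0],
    "g4": [1.5, 2.5, 7.0],
}
print(knn_impute(expr, "g1", 2, set(), k=1))
print(knn_impute(expr, "g1", 2, {frozenset(("g1", "g3"))}, k=1, alpha=0.1))
```

In this toy data, marking g1 and g3 as interacting changes which gene is selected as the nearest neighbor, which is the effect the abstract attributes to the new similarity measure.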

3.
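Quantitative protein analysis based on mass spectrometry data has long been an important tool in high-throughput proteomics. However, because of the limitations of current mass spectrometry technology, large-scale protein quantification often produces many missing values, which to some extent affects the accuracy of downstream analyses. Although many missing-value imputation methods have been proposed, the proteomics field still lacks a comprehensive evaluation of how well these methods perform under different conditions. Based on the distributional characteristics of real data, this study constructs...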

4.
张超  张晖  李冀新  高红 《生物信息学》2006,4(3):128-131
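Genetic algorithms (GA), which derive from the laws of natural evolution, are adaptive, heuristic, probabilistic, iterative global search algorithms. This paper introduces the basic principles, procedure, and advantages of GA; summarizes the models and execution strategies used when applying GA to protein structure prediction; and reviews research progress in predicting protein structure by combining multiple algorithms.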

5.
Research progress in protein secondary structure prediction   Cited by: 1 (self-citations: 0, citations by others: 1)
唐媛  李春花  张瑗  尚进  邹凌云  李立奇 《生物磁学》2013,(26):5180-5182
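Knowledge of a protein's secondary structure is the basis for understanding its folding pattern and tertiary structure, provides a structural foundation for studying protein function and the modes of interaction between proteins, and can also aid new drug development. Studying protein secondary structure is therefore of great importance. With the arrival of the post-genomic era, more and more protein sequences are being discovered, which brings both great challenges and wide scope for research on protein secondary structure. Secondary structure information for proteins on a large scale is hard to obtain with traditional experimental methods, and bioinformatics remains the route by which the secondary structure of most proteins is obtained. In recent years the field has developed rapidly as many researchers have constructed protein data sets for secondary structure prediction, computed and extracted various kinds of protein feature information, and applied different prediction algorithms. This article reviews the extraction and selection of protein feature information, prediction algorithms, and methods for validating prediction performance, and introduces research progress in protein secondary structure prediction. With the continued development of genomics, proteomics, and bioinformatics, protein secondary structure prediction can be expected to keep achieving new breakthroughs.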

6.
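This paper independently establishes a method for predicting protein secondary structure with artificial neural networks. By analyzing the distribution matrix that we propose (a matrix expressing the probability that each conformational class is predicted as each of the classes), the errors of the method and their possible causes are analyzed in more depth than before. On this basis a modified learning method is proposed, which improves both the prediction accuracy and the correlation coefficients for regular secondary structures (α-helix and β-sheet).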

7.
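A protein design method based on the hydrophobic-hydrophilic model is presented. The method is grounded in physical principles and uses relative entropy as the objective function to be optimized. It was tested on real proteins with native structures of four different structural types, and the main factors affecting the success rate were analyzed. The results show that the method is general and can be used to design sequences for proteins of different structural types.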

8.
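Protein secondary structure prediction is an important research topic in bioinformatics and one of the most difficult problems in proteome research. Secondary structure prediction has practical significance for understanding the relationship between protein structure and function, as well as for molecular design, biopharmaceuticals, and related fields. It is also the link between primary and tertiary structure and lays the groundwork for studying tertiary structure. Although dozens of prediction methods exist, even the most accurate reaches only a little over 70%. This paper analyzes current methods in the hope of arriving at a more accurate one.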

9.
An analysis of protein structure prediction methods   Cited by: 1 (self-citations: 0, citations by others: 1)
刘云玲  陶兰 《生物信息学》2007,5(4):185-186
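Three theoretical approaches to protein structure prediction are first introduced. The main methods for side-chain construction and loop-region modeling in homologous protein structure prediction are then discussed, and the main algorithms for conformational space search in non-homologous protein structure prediction are analyzed and compared.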

10.
石鸥燕  杨晶  杨惠云  田心 《现代生物医学进展》2007,7(11):1723-1724,1706
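Protein secondary structure prediction is a crucial step toward understanding the spatial structure of proteins. This article proposes a simple secondary structure prediction method that uses majority voting to pool the predictions of three existing, well-performing secondary structure prediction methods into a consensus prediction. The method was tested on 57 proteins with less than 30% similarity, randomly selected from structures newly determined in the PDB database over the preceding two years. Its Q3 accuracy was 1.12–2.29% higher than that of the three individual methods, and the correlation coefficients and SOV accuracy improved correspondingly. All accuracy measures were also higher than those of the Jpred secondary structure prediction program, which likewise uses a consensus approach. Although the principle is simple, the method requires no additional parameters, has a low computational cost, and is easy to implement; the most important prerequisite is that the component methods be among the most accurate protein secondary structure prediction methods currently available.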
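Per-residue majority voting and the Q3 measure can be sketched as follows. This is a minimal illustration, not the authors' program: it assumes each predictor returns a string over the three states H, E, and C, and the tie-breaking rule (fall back to one designated predictor when all three disagree) is an assumption, since the abstract does not say how ties are resolved.

```python
from collections import Counter

def consensus(predictions, fallback_index=0):
    """Per-residue majority vote over equal-length secondary-structure strings."""
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    result = []
    for states in zip(*predictions):
        state, votes = Counter(states).most_common(1)[0]
        # If no state gets more than one vote, use the designated predictor
        # (assumed tie-breaking rule).
        result.append(state if votes > 1 else states[fallback_index])
    return "".join(result)

def q3(predicted, observed):
    """Fraction of residues whose three-state label matches the observed label."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)

preds = ["HHHECC", "HHEECC", "HCHECH"]
cons = consensus(preds)
print(cons)
print(q3(cons, "HHEECC"))
```

Because the vote uses only the component predictions, the sketch has no trainable parameters, which matches the abstract's remark that the approach needs no additional parameters.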

11.
The use of a Gaussian-based representation of protein structures for evaluating protein-structure similarities and deriving three-dimensional superpositions is presented. The approach, as implemented in the program GAPS, is applied to three pairs of proteins with different topological characteristics (rich α-helix, mixed α-helix/β-strand, and rich β-strand), low sequence identities (10–30%), and recognized difficulties in defining a unique optimum alignment. Validation of the GAPS superpositions is done by comparison with superpositions obtained by the TOP, GA_FIT, and ALIGN programs and those directly extracted from the FSSP database. Results suggest that a Gaussian-based methodology offers an objective means to derive, depending on the Gaussian-based representation, a consensus three-dimensional superposition when alternative superposition solutions exist.

12.
Ying Yuan  Guosheng Yin 《Biometrics》2010,66(1):105-114
Summary. We study quantile regression (QR) for longitudinal measurements with nonignorable intermittent missing data and dropout. Compared to conventional mean regression, quantile regression can characterize the entire conditional distribution of the outcome variable, and is more robust to outliers and misspecification of the error distribution. We account for the within-subject correlation by introducing an ℓ2 penalty in the usual QR check function to shrink the subject-specific intercepts and slopes toward the common population values. The informative missing data are assumed to be related to the longitudinal outcome process through the shared latent random effects. We assess the performance of the proposed method using simulation studies, and illustrate it with data from a pediatric AIDS clinical trial.
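As an illustration of the penalized check function only, the sketch below evaluates the standard quantile check loss plus an ℓ2 penalty on subject-specific deviations from the population coefficients. The data layout, the names `penalized_objective` and `lam`, and the exact form of the penalty are assumptions made for illustration; the missing-data model with shared latent random effects described in the abstract is not represented.

```python
def check_loss(u, tau):
    """Quantile check function: rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def penalized_objective(data, beta, deviations, tau, lam):
    """Check loss summed over all observations plus an l2 penalty.

    data: {subject: [(x_vector, y), ...]} (assumed layout).
    beta: population coefficients; deviations: {subject: b_i}, so that the
    subject-specific coefficients are beta + b_i.
    lam: penalty weight that shrinks each b_i toward zero.
    """
    total = 0.0
    for subject, rows in data.items():
        b = deviations[subject]
        coef = [bk + dk for bk, dk in zip(beta, b)]
        for x, y in rows:
            fitted = sum(c * xk for c, xk in zip(coef, x))
            total += check_loss(y - fitted, tau)
        total += lam * sum(dk * dk for dk in b)
    return total

data = {"a": [((1.0, 0.0), 3.0)]}
value = penalized_objective(data, [1.0, 0.0], {"a": [1.0, 0.0]}, tau=0.5, lam=2.0)
print(value)
```

A larger `lam` pulls every subject's intercept and slope closer to the population values, while `lam = 0` would fit each subject separately.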

13.
Allelic dropout is a commonly observed source of missing data in microsatellite genotypes, in which one or both allelic copies at a locus fail to be amplified by the polymerase chain reaction. Especially for samples with poor DNA quality, this problem causes a downward bias in estimates of observed heterozygosity and an upward bias in estimates of inbreeding, owing to mistaken classifications of heterozygotes as homozygotes when one of the two copies drops out. One general approach for avoiding allelic dropout involves repeated genotyping of homozygous loci to minimize the effects of experimental error. Existing computational alternatives often require replicate genotyping as well. These approaches, however, are costly and are suitable only when enough DNA is available for repeated genotyping. In this study, we propose a maximum-likelihood approach together with an expectation-maximization algorithm to jointly estimate allelic dropout rates and allele frequencies when only one set of nonreplicated genotypes is available. Our method considers estimates of allelic dropout caused by both sample-specific factors and locus-specific factors, and it allows for deviation from Hardy–Weinberg equilibrium owing to inbreeding. Using the estimated parameters, we correct the bias in the estimation of observed heterozygosity through the use of multiple imputations of alleles in cases where dropout might have occurred. With simulated data, we show that our method can (1) effectively reproduce patterns of missing data and heterozygosity observed in real data; (2) correctly estimate model parameters, including sample-specific dropout rates, locus-specific dropout rates, and the inbreeding coefficient; and (3) successfully correct the downward bias in estimating the observed heterozygosity. We find that our method is fairly robust to violations of model assumptions caused by population structure and by genotyping errors from sources other than allelic dropout. 
Because the data sets imputed under our model can be investigated in additional subsequent analyses, our method will be useful for preparing data for applications in diverse contexts in population genetics and molecular ecology.

14.
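Structure, function, and applications of oxysterol-binding proteins   Cited by: 1 (self-citations: 0, citations by others: 1)
Oxysterol-binding protein (OSBP) belongs to a class of non-vesicular transport proteins that are present in eukaryotic cells and participate in lipid metabolism. In mammals these proteins are called oxysterol-binding protein-related proteins (ORPs), and in yeast they are called oxysterol-binding protein homologues (OSH). Research on oxysterol-binding proteins has deepened in recent years; studies of the structural and functional differences among the homologous proteins (for example ORP5/8, Osh3/4, and ORP4L), of their roles in signal transduction, and of their biomedical applications have become hot topics in the field. This article reviews research on the structure and function of OSBP and its homologues and points out some key open problems in the field. It also compares OSH and ORPs at intracellular membrane contact sites (MCS) and offers an outlook on future directions for OSBP research.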

15.
16.
Summary. Often a binary variable is generated by dichotomizing an underlying continuous variable measured at a specific time point according to a prespecified threshold value. In the event that the underlying continuous measurements are from a longitudinal study, one can use the repeated-measures model to impute missing data on responder status as a result of subject dropout and apply the logistic regression model on the observed or otherwise imputed responder status. Standard Bayesian multiple imputation techniques (Rubin, 1987, in Multiple Imputation for Nonresponse in Surveys) that draw the parameters for the imputation model from the posterior distribution and construct the variance of parameter estimates for the analysis model as a combination of within- and between-imputation variances are found to be conservative. The frequentist multiple imputation approach that fixes the parameters for the imputation model at the maximum likelihood estimates and constructs the variance of parameter estimates for the analysis model using the results of Robins and Wang (2000, Biometrika 87, 113–124) is shown to be more efficient. We propose to apply the Kenward and Roger (1997, Biometrics 53, 983–997) degrees of freedom to account for the uncertainty associated with variance-covariance parameter estimates for the repeated measures model.

17.
Summary. In a typical randomized clinical trial, a continuous variable of interest (e.g., bone density) is measured at baseline and fixed postbaseline time points. The resulting longitudinal data, often incomplete due to dropouts and other reasons, are commonly analyzed using parametric likelihood-based methods that assume multivariate normality of the response vector. If the normality assumption is deemed untenable, then semiparametric methods such as (weighted) generalized estimating equations are considered. We propose an alternate approach in which the missing data problem is tackled using multiple imputation, and each imputed dataset is analyzed using robust regression (M-estimation; Huber, 1973, Annals of Statistics 1, 799–821) to protect against potential non-normality/outliers in the original or imputed dataset. The robust analysis results from each imputed dataset are combined for overall estimation and inference using either the simple Rubin (1987, Multiple Imputation for Nonresponse in Surveys, New York: Wiley) method, or the more complex but potentially more accurate Robins and Wang (2000, Biometrika 87, 113–124) method. We use simulations to show that our proposed approach performs at least as well as the standard methods under normality, but is notably better under both elliptically symmetric and asymmetric non-normal distributions. A clinical trial example is used for illustration.
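The simple Rubin combination step can be sketched as below. It covers only the pooling of per-dataset results: the pooled estimate is the mean of the m estimates, and the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance. The per-dataset estimates and variances are taken as given inputs (in the abstract they would come from robust regression), the function name `pool_rubin` is illustrative, and the Robins and Wang method is not shown.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Combine m per-imputation estimates and their variances by Rubin's rules."""
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)       # pooled point estimate
    within = mean(variances)      # average within-imputation variance
    between = variance(estimates) # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total, math.sqrt(total)

estimate, total_var, std_err = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, total_var, std_err)
```

The (1 + 1/m) factor inflates the between-imputation component to reflect that only a finite number of imputed datasets was used.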

18.
G. Y. Yi  W. Liu  Lang Wu 《Biometrics》2011,67(1):67-75
Summary. Longitudinal data arise frequently in medical studies and it is common practice to analyze such data with generalized linear mixed models. Such models enable us to account for various types of heterogeneity, including between- and within-subject heterogeneity. Inferential procedures become dramatically more complicated when missing observations or measurement error arise. In the literature, there has been considerable interest in accommodating either incompleteness or covariate measurement error under random effects models. However, there is relatively little work concerning both features simultaneously. There is a need to fill this gap, as longitudinal data often have both characteristics. In this article, our objectives are to study the simultaneous impact of missingness and covariate measurement error on inferential procedures and to develop a method that is both computationally feasible and theoretically valid. Simulation studies are conducted to assess the performance of the proposed method, and a real example is analyzed with the proposed method.

19.
20.
Bootstrap is a time-honoured distribution-free approach for attaching standard error to any statistic of interest, but has not received much attention for data with missing values, especially when using imputation techniques to replace missing values. We propose a proportional bootstrap method that allows effective use of imputation techniques for all bootstrap samples. Five deterministic imputation techniques are examined and particular emphasis is placed on the estimation of standard error for the correlation coefficient. Some real data examples are presented. Other possible applications of the proposed bootstrap method are discussed.
