Similar articles
20 similar articles found (search time: 531 ms)
1.
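All higher-order structures of a protein molecule are determined by the information contained in its primary structure, that is, its sequence of amino acid residues. Over the years, no fewer than a dozen methods have been proposed for predicting secondary structure from a protein's amino acid sequence. Among them, the method of Chou and Fasman, proposed in 1974 and revised and refined by 1978, has given very good results and is receiving increasing attention. Its outstanding advantage is simplicity: the secondary structure of a protein can be predicted without complex computer analysis, with an accuracy of about 80%. At present, X-ray crystal diffraction of course remains the most accurate way to determine protein secondary structure. The Chou-Fasman method is itself built on a complete set of data derived statistically from the results of crystal analysis.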

2.
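Preliminary analysis of the complete genome sequences of the SARS coronavirus   Cited by: 4 (self-citations: 0; by others: 4)
Multiple sequence alignment of 12 fully sequenced SARS virus genomes showed that the main body of the sequence, 29,708 bases, has 99.82% identical bases. Apart from deletions of 5 and 6 bases in two of the sequences, nucleotide differences occur at 42 sites in the remainder, and the differences at 28 of these sites can change an amino acid residue. Using bioinformatics tools such as protein secondary structure prediction, transmembrane helix prediction, and protein localization, the protein conformation at the sites of amino acid change was analyzed and the possible structural and functional changes were inferred, providing a reference for further biological experiments. All analysis results are also published on the anti-SARS website of the Center for Bioinformatics, Peking University (antisars.cbi.pku.edu.cn).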
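The site counts reported in this abstract (identical bases, deleted bases, differing sites) are the kind of summary that can be read off the columns of an alignment. The following is a minimal sketch under stated assumptions, not the authors' procedure: the function name `alignment_site_summary` is hypothetical, the input is assumed to be already aligned sequences of equal length, and `-` is assumed to mark a gap.

```python
def alignment_site_summary(aligned_seqs, gap="-"):
    """Classify each alignment column as identical, variable, or gapped.

    Assumes the sequences are already aligned (equal length) and that
    `gap` marks a deleted base. Positions are reported 1-based.
    """
    if not aligned_seqs:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("aligned sequences must all have the same length")

    identical = 0
    gapped = 0
    variable_positions = []
    for i in range(length):
        column = {s[i].upper() for s in aligned_seqs}
        if gap in column:
            gapped += 1                       # at least one sequence has a deletion here
        elif len(column) == 1:
            identical += 1                    # every sequence carries the same base
        else:
            variable_positions.append(i + 1)  # bases differ at this site

    return {
        "length": length,
        "identical": identical,
        "gapped": gapped,
        "variable_positions": variable_positions,
        "percent_identical": 100.0 * identical / length,
    }


if __name__ == "__main__":
    toy = ["ATGCA-T", "ATGTA-T", "ATGCAGT"]  # toy data, not SARS sequences
    print(alignment_site_summary(toy))
```

Deciding which of the variable sites change an amino acid would additionally require the reading frame and a codon table, which this sketch leaves out.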

3.
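Zhang Jing (张静)  Gu Baohong (顾宝洪) 《Zoological Research (动物学研究)》1998,19(5):350-358
Analysis of the secondary structures of mRNAs encoding mature peptides shows that each codon has a certain positional tendency within the mRNA secondary structure, and this tendency appears to be consistent with the conformational properties of the corresponding amino acid. Most codons encoding hydrophobic amino acids lie in the more stable stem regions of the mRNA secondary structure; conversely, most codons encoding hydrophilic amino acids lie in the flexible loop regions. This result supports the recent conclusion that three-dimensional structural information is transferred between mRNA and protein.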

4.
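Taking the basic amino acids at the hemagglutinin cleavage site of H5N2 subtype avian influenza virus strains as the object of study, this work investigates their codon preference and the folded secondary structure of the corresponding mRNA sequences. The aim is to explore the relationship between viral pathogenicity and the secondary structure of the mRNA nucleotide segment corresponding to the cleavage-site amino acids, in the hope of providing some basic information for avian influenza research. The mRNA samples were extended in equal-length steps, and the RNAstructure 4.1 program was used to predict the dynamically extended folding secondary structures of these samples. The sequence and structure analyses showed that the basic amino acids at the cleavage site have a strong preference for adenine-rich codons, and that the nucleotides of the mRNA segment corresponding to the basic amino acids lie mainly in single-stranded loop regions of the folded secondary structure, with a few in paired helical regions. The results indicate that the size of the hairpin terminal loop formed by the mRNA nucleotides corresponding to the cleavage-site amino acids is positively correlated with the number of basic amino acids.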
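A preference for adenine-rich codons can be checked by tallying which codons encode the basic residues in a coding segment. The sketch below is illustrative only and makes several assumptions that are not in the abstract: only lysine and arginine are treated as basic residues, the segment is assumed to start in frame, and the function names `basic_codon_usage` and `adenine_fraction` are hypothetical.

```python
from collections import Counter

# Standard genetic code entries for the two basic residues considered here.
BASIC_CODONS = {
    "K": {"AAA", "AAG"},
    "R": {"CGU", "CGC", "CGA", "CGG", "AGA", "AGG"},
}


def basic_codon_usage(mrna):
    """Count codons encoding Lys or Arg in an in-frame mRNA segment."""
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    counts = Counter()
    usable = len(mrna) - len(mrna) % 3     # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = mrna[i:i + 3]
        for residue, codons in BASIC_CODONS.items():
            if codon in codons:
                counts[(residue, codon)] += 1
    return counts


def adenine_fraction(counts):
    """Fraction of adenine among all nucleotides of the counted codons."""
    total = sum(3 * n for n in counts.values())
    if total == 0:
        raise ValueError("no basic-residue codons were counted")
    adenines = sum(codon.count("A") * n for (_, codon), n in counts.items())
    return adenines / total


if __name__ == "__main__":
    usage = basic_codon_usage("AGAAAAAGACGG")  # toy segment, not viral data
    print(dict(usage), adenine_fraction(usage))
```

Comparing the resulting fraction against the fraction expected from uniform synonymous codon use would be one way to quantify the preference; that comparison is not shown here.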

5.
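[Objective] To obtain the cDNA sequence of PPDK from Deyeuxia angustifolia (小叶章), infer the physicochemical properties and the secondary and tertiary structures of the encoded protein, and carry out a phylogenetic analysis of PPDK amino acid sequences from known species. [Methods] The PPDK sequence was amplified by RT-PCR and analyzed with bioinformatics software including ProtParam, Blast, TMHMM, and SOPMA. [Results] The cDNA sequence of D. angustifolia PPDK was obtained; it is 1,035 bp long, with a relative molecular mass of 41.910 kDa and an isoelectric point of 9.70. The most abundant amino acid is serine (72 residues). Random coil accounts for 48.83% of the protein secondary structure, and the protein contains no signal peptide or transmembrane structure. The amino acid phylogenetic tree shows that D. angustifolia PPDK is most closely related to those of Brachypodium distachyon and rice. [Conclusion] D. angustifolia PPDK contains 385 amino acids, with 1 PDRP activity-regulation site and 72 phosphorylation sites involved in signal transduction.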

6.
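Discovery of association rules in protein sequences and its applications   Cited by: 2 (self-citations: 0; by others: 2)
As the machine learning algorithms used in protein sequence-structure analysis become more complex, interpreting their results and discovering knowledge from them also becomes more complicated, so simple and theoretically sound methods are needed. By introducing an association rule discovery algorithm, which is simple in principle, theoretically sound, and yields results of strong practical significance, tens of thousands of patterns were found in protein sequences. Worked examples demonstrate how these patterns can be applied in protein sequence analysis, for instance in conserved region discovery and secondary structure prediction. A secondary structure rule base and a simple secondary structure prediction algorithm were also built from these results; experiments show that about 81% of secondary structures can be predicted by at least one association rule.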

7.
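Information on the VP7 genes of different serotypes of group A rotavirus was retrieved from GenBank and compared at the amino acid level with the VP7 sequence of the G10 serotype LLR strain to identify serotype-specific conserved amino acid sequence sites. Guided by protein secondary structure prediction, three peptides carrying amino acid sequences specific to rotavirus serotype G10 were designed and synthesized. Assays of how the synthetic peptides inhibit binding between rotavirus immune serum and LLR antigen confirmed that all three peptides have the properties of LLR epitopes.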

8.
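To study the genetic mechanism of the MHC-DRB1 gene in Guizhou black goats and Guizhou white goats, pooled DNA combined with direct sequencing of PCR products was used to analyze polymorphism in the exon 3 region of MHC-DRB1, and bioinformatics software was used to analyze the mRNA secondary structure, the protein secondary structure, and the antigenic epitopes of the sequences obtained by PCR amplification. Sequence comparison revealed 9 SNP sites. Four single-nucleotide mutations, A8858G (Ile-Val), G8969A (Gly-Ser), G8978A (Glu-Lys), and T9094G (Asp-Glu), change the encoded amino acid; the other 5 sites are synonymous mutations. Further analysis showed that the mutation sites A8858G, C8974T, T9094G, C9100T, and G9124A alter the mRNA secondary structure, and that A8858G and T9094G have the greatest effect on protein secondary structure.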

9.
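Using the cDNA sequence of the PYY gene of the Japanese eel (日本鳗鲡) as an information probe, the Chinese sturgeon (中华鲟) EST library was searched, the EST sequence of Chinese sturgeon PYY was obtained, and the sequence was analyzed bioinformatically. The results show that this cDNA sequence contains a complete open reading frame; the encoded protein comprises 97 amino acids, the first 28 of which form the signal peptide, with a molecular weight of 11.03 kD and a theoretical isoelectric point of 5.54. The protein sequence shows high homology with the PYY peptides of flounder (牙鲆), pufferfish, and zebrafish. Thirty-six of the amino acids form the PAH domain, which consists of one α-helix and one random coil region. An alignment of 139 eukaryotic neuropeptide Y protein sequences, including Chinese sturgeon PYY, shows that 6 amino acid sites are highly conserved; 5 of them lie on the α-helix and the remaining one, a proline, lies in the random coil region, suggesting that the α-helix may also play an important role.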

10.
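With the completion of genome sequencing for humans, several model organisms, economically important organisms, and a large number of microorganisms, biological research as a whole has entered the genomic era. Over the past 5 to 10 years, phylogenetic inference from genome structural information has become one of the frontier areas of taxonomy and evolutionary biology. Compared with mutations in nucleotide or amino acid sequences, structural changes in the genome (intron insertions/deletions, retroposon integrations, signature sequences, gene duplications, gene order, and so on) are relatively rare phylogenetic information on larger spatial (or temporal) scales, and are generally used to study relationships among families and higher taxa. Obtaining complete genome sequences and determining the position of each gene within them makes it easier to combine phylogenetic information from different levels of the genome and to carry out molecular systematics using total molecular evidence (including genomic information; DNA, RNA, and protein sequence information; and the higher-order structures of RNA and proteins).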

11.
Lee S  Lee BC  Kim D 《Proteins》2006,62(4):1107-1114
Knowing a protein's structure and inferring its function from that structure is one of the main issues of computational structural biology, and often the first step is studying protein secondary structure. There have been many attempts to predict protein secondary structure content. Previous attempts assumed that the secondary structure content of a protein can be predicted successfully from its amino acid composition. Recent methods achieved remarkable prediction accuracy by using expanded composition information. The overall average error of the most successful method is 3.4%. Here, we demonstrate that even if we use only the simple amino acid composition, it is possible to improve the prediction accuracy significantly if evolutionary information is included. The idea is motivated by the observation that evolutionarily related proteins share similar structures. After calculating the homolog-averaged amino acid composition of a protein, which can be easily obtained from the multiple sequence alignment by running PSI-BLAST, those 20 numbers are learned by a multiple linear regression, an artificial neural network and a support vector regression. The overall average error of the method using support vector regression is 3.3%. It is remarkable that we obtain comparable accuracy without utilizing expanded composition information such as pair-coupled amino acid composition. This work again demonstrates that the amino acid composition is a fundamental characteristic of a protein. It is anticipated that our novel idea can be applied to many areas of protein bioinformatics where amino acid composition information is utilized, such as subcellular localization prediction, enzyme subclass prediction, domain boundary prediction, signal sequence prediction, and prediction of unfolded segments in a protein sequence, to name a few.
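The "20 numbers" in this abstract are the homolog-averaged amino acid composition. One plausible reading can be sketched as follows. This is an assumption-laden illustration, not the paper's code: the homologous sequences are taken as already collected (the PSI-BLAST search itself is not shown), each homolog is weighted equally, and the names `composition` and `homolog_averaged_composition` are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, fixed order


def composition(seq):
    """Return the 20 amino acid frequencies of one sequence.

    Characters outside the 20 standard residues (gaps, X, etc.) are ignored.
    """
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    n = len(residues)
    return [residues.count(aa) / n for aa in AMINO_ACIDS]


def homolog_averaged_composition(homologs):
    """Average the per-sequence compositions over a set of homologs.

    Assumption: every homolog contributes with equal weight.
    """
    if not homologs:
        raise ValueError("need at least one homologous sequence")
    comps = [composition(s) for s in homologs]
    return [sum(c[i] for c in comps) / len(comps)
            for i in range(len(AMINO_ACIDS))]


if __name__ == "__main__":
    features = homolog_averaged_composition(["MKVLA", "MKILA", "MRVLG"])
    print(dict(zip(AMINO_ACIDS, features)))
```

The resulting 20-element vector is what a regression model (linear, neural network, or support vector regression, as named in the abstract) would take as input.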

12.
Protein structure information is very useful for confirming protein function. The protein structural class can provide information for protein 3D structure analysis, because the overall folding type of a protein plays a significant part in molecular biology. In this paper, we focus on predicting protein structural class using a new feature representation. We extract features from the Chou-Fasman parameters, amino acid composition, amino acid hydrophobicity, polarity information and pair-coupled amino acid composition. The prediction results obtained with a support vector machine (SVM) classifier show that our method outperforms some other methods.

13.
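Predicting membrane protein types with the random forest method   Cited by: 2 (self-citations: 0; by others: 2)
The type of a membrane protein is closely related to its function, so predicting membrane protein type is an important means of studying function, and predicting the type from the amino acid sequence of the protein is of great significance. Based on the amino acid sequence, this paper uses the combined increment of diversity together with pseudo amino acid composition information as prediction parameters and applies a random forest classifier to predict 8 classes of membrane proteins. The prediction accuracy was 86.3% under the jackknife test and 93.8% on the independent test, better than previously reported results.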

14.
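By studying an algorithm based on the weight matrix of a neural network, this work mines the intrinsic relationship between protein secondary structure and amino acid sequence in order to improve the accuracy of predicting secondary structure from primary sequence. Neural networks perform well at feature classification, and the matrix of neuron connection weights obtained after training captures the intrinsic features and regularities of the samples. The study scores predictions using the neural network weight matrix, uses an offset alignment method to find sensitive amino acid neighborhoods, and analyzes the common behavior of the test set under different window lengths. Experiments show that prediction performance changes markedly at a sliding window length of L=7, and that the amino acid residue at neighborhood position P=4 strengthens prediction performance. The approach offers a new algorithm design for protein secondary structure prediction based on local sequence features.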
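Window-based predictors of this kind feed the model one fixed-length stretch of sequence per residue. A minimal sketch of that windowing step is shown below, using the window length L=7 from the abstract as the default. The padding symbol `X`, the centering of each window on its residue, and the function name `sliding_windows` are assumptions; the abstract does not say how chain ends are handled or how the position index P is defined, so neither is modeled here.

```python
def sliding_windows(seq, length=7, pad="X"):
    """Return one window per residue, centered on that residue.

    The sequence is padded at both ends so that terminal residues also
    get a full-length window. `length` must be odd so a center exists.
    """
    if length < 1 or length % 2 == 0:
        raise ValueError("window length must be a positive odd number")
    half = length // 2
    padded = pad * half + seq + pad * half
    # Window i covers padded[i : i + length]; its center is residue i of seq.
    return [padded[i:i + length] for i in range(len(seq))]


if __name__ == "__main__":
    for residue, window in zip("ACDEFGH", sliding_windows("ACDEFGH")):
        print(residue, window)
```

Each window would then be encoded numerically (for example one-hot per position) before being scored by a trained weight matrix.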

15.
Intrinsically unstructured proteins (IUPs) are proteins lacking a fixed three dimensional structure or containing long disordered regions. IUPs play an important role in biology and disease. Identifying disordered regions in protein sequences can provide useful information on protein structure and function, and can assist high-throughput protein structure determination. In this paper we present a system for predicting disordered regions in proteins based on decision trees and reduced amino acid composition. Concise rules based on biochemical properties of amino acid side chains are generated for prediction. Coarser information extracted from the composition of amino acids can not only improve the prediction accuracy but also increase the learning efficiency. In cross-validation tests, with four groups of reduced amino acid composition, our system can achieve a recall of 80% at a 13% false positive rate for predicting disordered regions, and the overall accuracy can reach 83.4%. This prediction accuracy is comparable to most, and better than some, existing predictors. Advantages of our approach are high prediction accuracy for long disordered regions and efficiency for large-scale sequence analysis. Our software is freely available for academic use upon request.
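A reduced amino acid composition collapses the 20 residue frequencies into a handful of group frequencies. The sketch below illustrates the idea with four groups. The grouping shown (hydrophobic, polar, positive, negative) is a hypothetical choice for illustration and is not taken from the paper, which does not list its groups in the abstract; the name `reduced_composition` is likewise an assumption.

```python
# Hypothetical four-group alphabet based on side-chain properties.
# Every standard residue belongs to exactly one group.
GROUPS = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGP",
    "positive": "KRH",
    "negative": "DE",
}

# Reverse lookup: residue letter -> group name.
RESIDUE_TO_GROUP = {aa: name for name, members in GROUPS.items() for aa in members}


def reduced_composition(seq):
    """Return the fraction of residues falling into each group.

    Characters that are not standard residues are ignored.
    """
    counts = {name: 0 for name in GROUPS}
    total = 0
    for aa in seq.upper():
        group = RESIDUE_TO_GROUP.get(aa)
        if group is not None:
            counts[group] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    print(reduced_composition("MKDEEKSSPLA"))  # toy segment
```

Because the feature vector has only four entries, rules learned by a decision tree on it stay short, which matches the abstract's emphasis on concise rules and learning efficiency.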

16.
Due to advances in molecular biology the DNA sequences of structural genes coding for proteins are often known before a protein is characterized or even isolated. The function of a protein whose amino acid sequence has been deduced from a DNA sequence may not even be known. This has created greater interest in the development of methods to predict the tertiary structures of proteins. The a priori prediction of a protein's structure from its amino acid sequence is not yet possible. However, since proteins with similar amino acid sequences are observed to have similar three-dimensional structures, it is possible to use an analogy with a protein of known structure to draw some conclusions about the structure and properties of an uncharacterized protein. The process of predicting the tertiary structure of a protein relies very much upon computer modeling and analysis of the structure. The prediction of the structure of the bacteriophage 434 cro repressor is used as an example illustrating current procedures.

17.
Protein structure prediction: inroads to biology   Cited by: 1 (self-citations: 0; by others: 1)
Petrey D  Honig B 《Molecular cell》2005,20(6):811-819

18.
It has long been suspected that analysis of correlated amino acid substitutions should uncover pairs or clusters of sites that are spatially proximal in mature protein structures. Accordingly, methods based on different mathematical principles such as information theory, correlation coefficients and maximum likelihood have been developed to identify co-evolving amino acids from multiple sequence alignments. Sets of pairs of sites whose behaviour is identified by these methods as correlated are often significantly enriched in pairs of spatially proximal residues. However, relatively high levels of false-positive predictions typically render such methods, in isolation, of little use in the ab initio prediction of protein structure. Misleading signal (or problems with the estimation of significance levels) can be caused by phylogenetic correlations between homologous sequences and from correlation due to factors other than spatial proximity (for example, correlation of sites which are not spatially close but which are involved in common functional properties of the protein). In recent years, several workers have suggested that information from correlated substitutions should be combined with other sources of information (secondary structure, solvent accessibility, evolutionary rates) in an attempt to reduce the proportion of false-positive predictions. We review methods for the detection of correlated amino acid substitutions, compare their relative performance in contact prediction and predict future directions in the field.
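Of the approaches named in this abstract, the information-theoretic one is the easiest to illustrate: the mutual information between two alignment columns measures how much knowing the residue at one site tells you about the other. The following is a bare sketch of that statistic only. It assumes the two columns are given as equal-length strings (one character per sequence), applies no correction for phylogenetic correlation or small sample size, and uses the hypothetical name `column_mutual_information`.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    col_i[k] and col_j[k] must come from the same sequence k.
    """
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    freq_i = Counter(col_i)                # residue counts at site i
    freq_j = Counter(col_j)                # residue counts at site j
    freq_ij = Counter(zip(col_i, col_j))   # joint counts of residue pairs

    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    print(column_mutual_information("AAVV", "LLII"))  # sites vary together
    print(column_mutual_information("AAVV", "LILI"))  # sites vary independently
```

A high raw score on its own is not evidence of spatial contact, for the reasons the abstract gives: shared ancestry and shared function can both produce correlated columns.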

19.
20.
Chao Fang  Yi Shang  Dong Xu 《Proteins》2018,86(5):592-598
Protein secondary structure prediction can provide important information for protein 3D structure prediction and protein functions. Deep learning offers a new opportunity to significantly improve prediction accuracy. In this article, a new deep neural network architecture, named the Deep inception‐inside‐inception (Deep3I) network, is proposed for protein secondary structure prediction and implemented as a software tool MUFOLD‐SS. The input to MUFOLD‐SS is a carefully designed feature matrix corresponding to the primary amino acid sequence of a protein, which consists of a rich set of information derived from individual amino acids, as well as the context of the protein sequence. Specifically, the feature matrix is a composition of physicochemical properties of amino acids, PSI‐BLAST profile, and HHBlits profile. MUFOLD‐SS is composed of a sequence of nested inception modules and maps the input matrix to either eight states or three states of secondary structures. The architecture of MUFOLD‐SS enables effective processing of local and global interactions between amino acids in making accurate predictions. In extensive experiments on multiple datasets, MUFOLD‐SS significantly outperformed the best existing methods and other deep neural networks. MUFOLD‐SS can be downloaded from http://dslsrv8.cs.missouri.edu/~cf797/MUFoldSS/download.html .
