
1.

PseKNC: A flexible web server for generating pseudo K-tuple nucleotide composition

Wei Chen Tian-Yu Lei Dian-Chuan Jin Hao Lin Kuo-Chen Chou 《Analytical biochemistry》2014

The pseudo oligonucleotide composition, or pseudo K-tuple nucleotide composition (PseKNC), can be used to represent a DNA or RNA sequence with a discrete model or vector yet still keep considerable sequence order information, particularly the global or long-range sequence order information, via the physicochemical properties of its constituent oligonucleotides. Therefore, the PseKNC approach may hold very high potential for enhancing the power in dealing with many problems in computational genomics and genome sequence analysis. However, dealing with different DNA or RNA problems may need different kinds of PseKNC. Here, we present a flexible and user-friendly web server for PseKNC (at http://lin.uestc.edu.cn/pseknc/default.aspx) by which users can easily generate many different modes of PseKNC according to their need by selecting various parameters and physicochemical properties. Furthermore, for the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the current web server to generate their desired PseKNC without the need to follow the complicated mathematical equations, which are presented in this article just for the integrity of PseKNC formulation and its development. It is anticipated that the PseKNC web server will become a very useful tool in computational genomics and genome sequence analysis.
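As a rough illustration of the idea in the abstract above, the following Python sketch builds a PseKNC-style vector: normalized k-tuple frequencies followed by a few correlation terms computed from dinucleotide properties. The property table `TOY_PROPS`, the function names, and the default values of `k`, `lam`, and `w` are assumptions made for this example only; the web server's actual property sets and modes are not reproduced here.

```python
from itertools import product

BASES = "ACGT"
DINUCS = ["".join(p) for p in product(BASES, repeat=2)]

# Hypothetical, made-up property values for the 16 dinucleotides.
# Real applications would use measured physicochemical properties.
TOY_PROPS = [{d: i / 15.0 for i, d in enumerate(DINUCS)}]


def kmer_frequencies(seq, k):
    """Normalized occurrence frequencies of all 4**k k-tuples."""
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(seq) - k + 1
    for i in range(total):
        counts[seq[i:i + k]] += 1
    return [counts[m] / total for m in kmers]


def theta(seq, lag, props):
    """Mean squared property difference between dinucleotides `lag` apart."""
    n_pairs = len(seq) - lag - 1
    total = 0.0
    for i in range(n_pairs):
        a, b = seq[i:i + 2], seq[i + lag:i + lag + 2]
        total += sum((p[a] - p[b]) ** 2 for p in props) / len(props)
    return total / n_pairs


def pseknc(seq, k=2, lam=3, w=0.5, props=TOY_PROPS):
    """Return a vector with 4**k composition terms and `lam` correlation terms."""
    if len(seq) <= lam + 1 or len(seq) < k:
        raise ValueError("sequence too short for the chosen k and lam")
    freqs = kmer_frequencies(seq, k)
    thetas = [theta(seq, j, props) for j in range(1, lam + 1)]
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]
```

Calling `pseknc("ACGTTGCAAGCT")` with these defaults yields a 19-dimensional vector (16 dinucleotide terms plus 3 correlation terms) whose components sum to 1.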

2.

iDNA-Methyl: Identifying DNA methylation sites via pseudo trinucleotide composition

Zi Liu Xuan Xiao Wang-Ren Qiu Kuo-Chen Chou 《Analytical biochemistry》2015

Predominantly occurring on cytosine, DNA methylation is a process by which cells can modify their DNAs to change the expression of gene products. It plays very important roles not only in life development but also in the formation of nearly all types of cancer. Therefore, knowledge of DNA methylation sites is significant for both basic research and drug development. Given an uncharacterized DNA sequence containing many cytosine residues, which ones can be methylated and which ones cannot? With the avalanche of DNA sequences generated during the postgenomic age, it is highly desirable to develop computational methods for accurately identifying the methylation sites in DNA. Using the trinucleotide composition, pseudo amino acid components, and a dataset-optimizing technique, we have developed a new predictor called “iDNA-Methyl” that has achieved remarkably higher success rates in identifying the DNA methylation sites than the existing predictors. A user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/iDNA-Methyl, where users can easily get their desired results. We anticipate that the web-server predictor will become a very useful high-throughput tool for basic research and drug development and that the novel approach and technique can also be used to investigate many other DNA-related problems and genome analysis.

3.

iDNA6mA-PseKNC: Identifying DNA N⁶-methyladenosine sites by incorporating nucleotide physicochemical properties into PseKNC

Pengmian Feng Hui Yang Hui Ding Hao Lin Wei Chen Kuo-Chen Chou 《Genomics》2019,111(1):96-102

N⁶-methyladenine (6mA) is one kind of post-replication modification (PTM or PTRM) occurring in a wide range of DNA sequences. Accurate identification of its sites will be very helpful for revealing the biological functions of 6mA, but it is time-consuming and expensive to determine them by experiments alone. Unfortunately, so far, no bioinformatics tool is available to do so. To fill this gap, we have proposed a novel predictor called iDNA6mA-PseKNC that is established by incorporating nucleotide physicochemical properties into Pseudo K-tuple Nucleotide Composition (PseKNC). It has been observed via rigorous cross-validations that the predictor's sensitivity (Sn), specificity (Sp), accuracy (Acc), and stability (MCC) are 93%, 100%, 96%, and 0.93, respectively. For the convenience of most experimental scientists, a user-friendly web server for iDNA6mA-PseKNC has been established at http://lin-group.cn/server/iDNA6mA-PseKNC, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved.

4.

iTIS-PseTNC: A sequence-based predictor for identifying translation initiation site in human genes using pseudo trinucleotide composition

Wei Chen Peng-Mian Feng En-Ze Deng Hao Lin Kuo-Chen Chou 《Analytical biochemistry》2014

Translation is a key process for gene expression. Timely identification of the translation initiation site (TIS) is very important for conducting in-depth genome analysis. With the avalanche of genome sequences generated in the postgenomic age, it is highly desirable to develop automated methods for rapidly and effectively identifying TIS. Although some computational methods were proposed in this regard, none of them considered the global or long-range sequence-order effects of DNA, and hence their prediction quality was limited. To account for such effects, a new predictor, called “iTIS-PseTNC,” was developed by incorporating the physicochemical properties into the pseudo trinucleotide composition, quite similar to the PseAAC (pseudo amino acid composition) approach widely used in computational proteomics. It was observed by the rigorous cross-validation test on the benchmark dataset that the overall success rate achieved by the new predictor in identifying TIS locations was over 97%. As a web server, iTIS-PseTNC is freely accessible at http://lin.uestc.edu.cn/server/iTIS-PseTNC. To maximize the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the web server to obtain the desired results without the need to go through detailed mathematical equations, which are presented in this paper just for the integrity of the new prediction method.

5.

iHSP-PseRAAAC: Identifying the heat shock protein families using pseudo reduced amino acid alphabet composition

Peng-Mian Feng Wei Chen Hao Lin Kuo-Chen Chou 《Analytical biochemistry》2013

Heat shock proteins (HSPs) are a type of functionally related proteins present in all living organisms, both prokaryotes and eukaryotes. They play essential roles in protein–protein interactions such as folding and assisting in the establishment of proper protein conformation and prevention of unwanted protein aggregation. Their dysfunction may cause various life-threatening disorders, such as Parkinson’s, Alzheimer’s, and cardiovascular diseases. Based on their functions, HSPs are usually classified into six families: (i) HSP20 or sHSP, (ii) HSP40 or J-class proteins, (iii) HSP60 or GroEL/ES, (iv) HSP70, (v) HSP90, and (vi) HSP100. Although considerable progress has been achieved in discriminating HSPs from other proteins, it is still a big challenge to identify HSPs among their six different functional types according to their sequence information alone. With the avalanche of protein sequences generated in the post-genomic age, it is highly desirable to develop a high-throughput computational tool in this regard. To take up such a challenge, a predictor called iHSP-PseRAAAC has been developed by incorporating the reduced amino acid alphabet information into the general form of pseudo amino acid composition. One of the remarkable advantages of introducing the reduced amino acid alphabet is being able to avoid the notorious dimension disaster or overfitting problem in statistical prediction. It was observed that the overall success rate achieved by iHSP-PseRAAAC in identifying the functional types of HSPs among the aforementioned six types was more than 87%, which was derived by the jackknife test on a stringent benchmark dataset in which none of the HSPs included has ≥40% pairwise sequence identity to any other in the same subset. It has not escaped our notice that the reduced amino acid alphabet approach can also be used to investigate other protein classification problems.
As a user-friendly web server, iHSP-PseRAAAC is accessible to the public at http://lin.uestc.edu.cn/server/iHSP-PseRAAAC.
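To make the dimension argument in the abstract above concrete, the sketch below maps a protein sequence onto a reduced alphabet and computes its n-peptide composition. The six-cluster grouping in `CLUSTERS` is a hypothetical choice made for this example and is not claimed to be the clustering used by iHSP-PseRAAAC; the function names are likewise illustrative.

```python
from itertools import product

# Hypothetical six-cluster grouping of the 20 amino acids (illustration only).
CLUSTERS = ["AVLIMC", "FWYH", "STNQ", "KR", "DE", "GP"]

# Each residue is represented by the first letter of its cluster.
TO_REDUCED = {aa: group[0] for group in CLUSTERS for aa in group}
ALPHABET = [group[0] for group in CLUSTERS]


def reduce_sequence(seq):
    """Rewrite a protein sequence in the reduced alphabet."""
    return "".join(TO_REDUCED[aa] for aa in seq)


def npeptide_composition(seq, n=2):
    """Frequencies of all reduced-alphabet words of length n."""
    reduced = reduce_sequence(seq)
    total = len(reduced) - n + 1
    if total < 1:
        raise ValueError("sequence shorter than n")
    words = ["".join(p) for p in product(ALPHABET, repeat=n)]
    counts = dict.fromkeys(words, 0)
    for i in range(total):
        counts[reduced[i:i + n]] += 1
    return [counts[w] / total for w in words]
```

With six clusters the dipeptide composition has 6² = 36 components instead of the 20² = 400 needed for the full alphabet, which is the kind of dimension reduction the abstract credits with easing overfitting.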

6.

iPSW(2L)-PseKNC: A two-layer predictor for identifying promoters and their strength by hybrid features via pseudo K-tuple nucleotide composition

Xuan Xiao Zhao-Chun Xu Wang-Ren Qiu Peng Wang Hui-Ting Ge Kuo-Chen Chou 《Genomics》2019,111(6):1785-1793


7.

Identify submitochondria and subchloroplast locations with pseudo amino acid composition: Approach from the strategy of discrete wavelet transform feature extraction

Shao-Ping Shi Jian-Ding Qiu Xing-Yu Sun Jian-Hua Huang Shu-Yun Huang Sheng-Bao Suo Ru-Ping Liang Li Zhang 《Biochimica et Biophysica Acta (BBA)/Molecular Cell Research》2011,1813(3):424-430

It is very challenging and complicated to predict protein locations at the sub-subcellular level. The key to enhancing the prediction quality for protein sub-subcellular locations is to grasp the core features of a protein that can discriminate among proteins with different subcompartment locations. In this study, a different formulation of pseudo amino acid composition based on discrete wavelet transform feature extraction was developed to predict submitochondria and subchloroplast locations. In jackknife cross-validation, our method efficiently distinguished mitochondrial proteins from chloroplast proteins with a total accuracy of 98.8% and obtained a promising total accuracy of 93.38% for predicting submitochondria locations. In particular, the predictive accuracies for the mitochondrial outer membrane and the chloroplast thylakoid lumen were 82.93% and 82.22%, respectively, improvements of 4.88% and 27.22% over other existing methods. The results indicated that the proposed method might be employed as a useful assistant technique for identifying sub-subcellular locations. We have implemented our algorithm as an online service called SubIdent (http://bioinfo.ncu.edu.cn/services.aspx).

8.

Using cellular automata images and pseudo amino acid composition to predict protein subcellular location (total citations: 6; self-citations: 0; citations by others: 6)

Xiao X Shao S Ding Y Huang Z Chou KC 《Amino acids》2006,30(1):49-54

Summary. The avalanche of newly found protein sequences in the post-genomic era has motivated and challenged us to develop an automated method that can rapidly and accurately predict the localization of an uncharacterized protein in cells because the knowledge thus obtained can greatly speed up the process of finding its biological functions. However, it is very difficult to establish such a desired predictor by acquiring the key statistical information buried in a pile of extremely complicated and highly variable sequences. In this paper, based on the concept of the pseudo amino acid composition (Chou, K. C. PROTEINS: Structure, Function, and Genetics, 2001, 43: 246–255), the approach of cellular automata image is introduced to cope with this problem. Many important features, which are originally hidden in the long amino acid sequences, can be clearly displayed through their cellular automata images. One of the remarkable merits of doing so is that many image recognition tools can be straightforwardly applied to the target aimed here. High success rates were observed through the self-consistency, jackknife, and independent dataset tests, respectively.

9.

Using pseudo amino acid composition to predict transmembrane regions in protein: cellular automata and Lempel-Ziv complexity

Diao Y Ma D Wen Z Yin J Xiang J Li M 《Amino acids》2008,34(1):111-117

Summary. Transmembrane (TM) proteins represent about 20–30% of the protein sequences in higher eukaryotes, playing important roles across a range of cellular functions. Moreover, knowledge about topology of these proteins often provides crucial hints toward their function. Due to the difficulties in experimental structure determinations of TM proteins, theoretical prediction methods are highly preferred in identifying the topology of newly found ones according to their primary sequences, useful in both basic research and drug discovery. In this paper, based on the concept of pseudo amino acid composition (PseAA) that can incorporate sequence-order information of a protein sequence so as to remarkably enhance the power of discrete models (Chou, K. C., Proteins: Structure, Function, and Genetics, 2001, 43: 246–255), cellular automata and Lempel-Ziv complexity are introduced to predict the TM regions of integral membrane proteins including both α-helical and β-barrel membrane proteins, validated by jackknife test. The result thus obtained is quite promising, which indicates that the current approach might be a promising high-throughput tool in the post-genomic era. The source code and dataset are available for academic users at liml@scu.edu.cn. Authors’ address: Menglong Li, College of Chemistry, Sichuan University, Chengdu, Sichuan 610064, P.R. China
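For readers unfamiliar with the Lempel-Ziv complexity mentioned in the abstract above, here is a minimal sketch of the classic 1976 definition: the number of phrases obtained when a sequence is parsed left to right into the shortest substrings not already reproducible from the preceding text. Which Lempel-Ziv variant the authors used, and how they encoded protein sequences before applying it, is not stated in the abstract, so both are left open here; the function name is illustrative.

```python
def lz_complexity(s):
    """Lempel-Ziv (1976) complexity: number of phrases in the exhaustive parsing."""
    n = len(s)
    i = 0       # start of the current phrase
    count = 0   # number of phrases found so far
    while i < n:
        length = 1
        # Grow the phrase while it can still be copied from the text
        # that precedes its last character.
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count
```

For example, the binary string `0001101001000101` is parsed as `0 | 001 | 10 | 100 | 1000 | 101`, giving a complexity of 6; highly repetitive sequences yield small values and irregular ones yield larger values.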

10.

Prediction of protein homo-oligomer types by pseudo amino acid composition: Approached with an improved feature extraction and Naive Bayes Feature Fusion

Zhang SW Pan Q Zhang HC Shao ZC Shi JY 《Amino acids》2006,30(4):461-468

Summary. The interaction of non-covalently bound monomeric protein subunits forms oligomers. The oligomeric proteins are superior to the monomers within the scope of functional evolution of biomacromolecules. Such complexes are involved in various biological processes, and play an important role. It is highly desirable to predict oligomer types automatically from their sequence. Here, based on the concept of pseudo amino acid composition, an improved feature extraction method of weighted auto-correlation function of amino acid residue index and Naive Bayes multi-feature fusion algorithm is proposed and applied to predict protein homo-oligomer types. We used the support vector machine (SVM) as base classifiers, in order to obtain better results. For example, the total accuracies of A, B, C, D and E sets based on this improved feature extraction method are 77.63, 77.16, 76.46, 76.70 and 75.06% respectively in the jackknife test, which are 6.39, 5.92, 5.22, 5.46 and 3.82% higher than that of G set based on conventional amino acid composition method with the same SVM. Compared with Chou’s feature extraction method of incorporating quasi-sequence-order effect, our method can increase the total accuracy by between 1.01 and 3.51%. The total accuracy improves from 79.66 to 80.83% by using the Naive Bayes Feature Fusion algorithm. These results show: 1) The improved feature extraction method is effective and feasible, and the feature vectors based on this method may contain more protein quaternary structure information and appear to capture essential information about the composition and hydrophobicity of residues in the surface patches that are buried in the interfaces of associated subunits; 2) Naive Bayes Feature Fusion algorithm and SVM can be regarded as a powerful computational tool for predicting protein homo-oligomer types.
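The auto-correlation features referred to in the abstract above can be sketched in their plain, unweighted form as follows. The abstract does not give the weighting scheme or name the residue index, so this example assumes uniform weights and uses the Kyte-Doolittle hydropathy scale purely as a stand-in; `KD` and `autocorrelation` are names chosen for illustration.

```python
# Kyte-Doolittle hydropathy values, used here only as an example residue index.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def autocorrelation(seq, max_lag, index=KD):
    """Unweighted auto-correlation r_n = mean of h_i * h_(i+n) for n = 1..max_lag."""
    h = [index[aa] for aa in seq]
    length = len(h)
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")
    return [
        sum(h[i] * h[i + n] for i in range(length - n)) / (length - n)
        for n in range(1, max_lag + 1)
    ]
```

Each lag contributes one feature, so a protein of any length is turned into a fixed-size vector that can be appended to its amino acid composition and passed to a classifier such as an SVM.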

11.

Using pseudo amino acid composition to predict protein subcellular location: Approached with Lyapunov index, Bessel function, and Chebyshev filter (total citations: 1; self-citations: 2; citations by others: 1)

Gao Y Shao S Xiao X Ding Y Huang Y Huang Z Chou KC 《Amino acids》2005,28(4):373-376

Summary. With the avalanche of new protein sequences we are facing in the post-genomic era, it is vitally important to develop an automated method for quickly and accurately determining the subcellular location of uncharacterized proteins. In this article, based on the concept of pseudo amino acid composition (Chou, K.C. Proteins: Structure, Function, and Genetics, 2001, 43: 246–255), three pseudo amino acid components are introduced via Lyapunov index, Bessel function, and Chebyshev filter that can be more efficiently used to deal with the chaos and complexity in protein sequences, leading to a higher success rate in predicting protein subcellular location.

12.

Gonad development and fatty acid composition of Patella depressa Pennant (Gastropoda: Prosobranchia) populations with different patterns of spatial distribution, in exposed and sheltered sites

Sofia Morais Luís Narciso Stephen J. Hawkins 《Journal of experimental marine biology and ecology》2003,294(1):61-80

The present study examines the effect of shore exposure on the feeding performance (assessed by fatty acid analyses of the whole body) and gonad condition (stage of development and gonad somatic index, GSI) of Patella depressa populations. Male and female limpets were collected at exposed and sheltered sites, during winter and summer. The population at the exposed site was at a more advanced stage of gonad development, with a higher dispersion of gonad stages, both in winter and summer. Additionally, limpets from the exposed site, particularly the males, presented a higher GSI than the corresponding stage in the sheltered site. The quantitatively most important fatty acids were the saturated fatty acids (SFA) 16:0, 14:0, and 18:0, the monounsaturated fatty acids (MUFA) 18:1(n−7), 18:1(n−9), 16:1(n−7) and 20:1(n−9) and the polyunsaturated fatty acids (PUFA) 20:5(n−3) and 20:4(n−6). Females had a significantly higher fatty acid methyl esters (FAME) content (in summer and winter) and higher amounts of SFA and MUFA (in summer), which points to a higher degree of storage of neutral lipids in this sex. Male and female limpets at the exposed site had a significantly higher FAME, SFA, MUFA, PUFA and highly unsaturated fatty acids (HUFA) content than the corresponding sex in the sheltered site in summer. In addition, an inversion in the eicosapentaenoic acid (EPA)/arachidonic acid (ARA) and (n−3)/(n−6) ratios was observed in the sheltered site, as a result of the significantly higher levels of ARA and (n−6) fatty acids and lower amounts of EPA and (n−3) fatty acids found in the sheltered limpets. A high variability among patches in the fatty acid composition in the exposed site was found in winter, possibly related to the aggregation of limpets at this time. The differences found between limpets from the exposed and sheltered sites suggest qualitative and quantitative differences in their diets. 
Additionally, the results show that the spatial aggregation strategy adopted by limpets in sites of great wave and wind exposure does not affect their feeding and reproductive success, at least in the site examined here. In fact, more developed gonads, a higher GSI and an elevated FAME content were found in the exposed population. Possible factors are suggested and discussed to explain these observations.

13.

The mRNAs of maternally and paternally inherited mtDNAs of the mussel Mytilus galloprovincialis: Start/end points and polycistronic transcripts

Evanthia Chatzoglou Eleni Kyriakou Eleftherios Zouros George C. Rodakis 《Gene》2013
