Similar articles
Found 20 similar articles; search took 15 ms
1.
The sequencing of various genomes has inaugurated a new stage in the understanding of normal and pathological cell function through the analysis of the role of proteins. It is proteins, after all, that intervene in the different molecular mechanisms of life, during growth, reproduction, and in the interaction between cells, thus making it possible to describe the biology of integrated systems. In this article, we briefly describe the various stages in the progression of our knowledge, from the genome to the "functional" proteome. Emphasis is placed on a global approach to the protein-protein interactions used to describe the cellular "interactome".

2.
3.
The CATH database of protein structures contains approximately 18,000 domains organized according to their (C)lass, (A)rchitecture, (T)opology and (H)omologous superfamily. Relationships between evolutionarily related structures (homologues) within the database have been used to test the sensitivity of various sequence search methods in order to identify relatives in GenBank and other sequence databases. Subsequent application of the most sensitive and efficient algorithms, gapped BLAST and the profile-based method Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST), could be used to assign structural data to between 22% and 36% of microbial genomes in order to improve functional annotation and enhance understanding of biological mechanism. However, on a cautionary note, an analysis of functional conservation within fold groups and homologous superfamilies in the CATH database revealed that, whilst function was conserved in nearly 55% of enzyme families, function had diverged considerably in some highly populated families. In these families, functional properties should be inherited far more cautiously and the probable effects of substitutions in key functional residues carefully assessed.

4.
Proteomics has become an important approach for investigating cellular processes and network functions. Significant improvements have been made during the last few years in technologies for high-throughput proteomics, both at the level of data analysis software and mass spectrometry hardware. As proteomics technologies advance and become more widely accessible, efforts to catalogue and quantify full proteomes are underway to complement other genomics approaches, such as RNA and metabolite profiling. Of particular interest is the application of proteome data to improve genome annotation and to include information on post-translational protein modifications with the annotation of the corresponding gene. This type of analysis requires a paradigm shift because amino acid sequences must be assigned to peptides without relying on existing protein databases. In this review, advances and current limitations of full proteome analysis are briefly highlighted using the model plant Arabidopsis thaliana as an example. Strategies to identify peptides are also discussed on the basis of MS/MS data in a protein database-independent approach.

5.
6.
Albrecht-Buehler G. Gene, 2012, 498(1): 20-27
The existence of fractal sets of DNA sequences has long been suspected on the basis of statistical analyses of genome data. In this article we explicitly identify, for the first time, the GA-sequences as a class of fractal genomic sequences that are easy to recognize and to extract, and are scattered densely throughout the chromosomes of a large number of genomes from different species and kingdoms, including the human genome. Their existence and their fractality may have significant consequences for our understanding of the origin and evolution of genomes. Furthermore, as universal and natural markers they may be used to chart and explore the non-coding regions.

7.
8.
9.
10.
11.
12.
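The non-random usage of k-mers in genome sequences, and its biological significance, has long attracted attention, but no fundamental progress has yet been made. Using the whole-genome sequences of seven species as samples, we obtained the 8-mer frequency spectrum of each species' genome. The spectra of dog and cattle have three peaks, whereas those of zebrafish, medaka, Caenorhabditis elegans and Saccharomyces cerevisiae have a single peak; the shape of the chicken spectrum lies between the two. When the set of 8-mers is classified by XY dinucleotide content, only classification by CG dinucleotide yields three subsets (0CG, 1CG and 2CG) whose spectra each form an independent unimodal distribution. Comparison with random sequences shows that 0CG motifs evolve randomly, whereas 1CG and 2CG motifs evolve directionally: their usage frequencies are far lower than the random frequencies, and this pattern of independent evolutionary separation holds across species. The distances between the spectra of the three CG subsets are the root cause of the unimodal or multimodal appearance. After normalizing the genome sequences of the seven species to 10^9 bp, comparison shows that the spectra of the 1CG and 2CG subsets correlate significantly with species evolution, whereas the spectrum of the 0CG subset does not. The three kinds of CG motif can therefore be regarded as performing different biological functions. The independent separation pattern of genomic 8-mers offers a new way of thinking about genome structure, genome evolution and the biological function of motifs.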
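
The classification step described in the abstract above (grouping 8-mers by how many CG dinucleotides they contain, then looking at each group's frequency spectrum) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, only k-mers that actually occur are counted, windows containing non-ACGT characters are skipped, and an 8-mer may fall into classes beyond 0CG, 1CG and 2CG.

```python
from collections import Counter


def kmer_counts(seq, k=8):
    """Count every overlapping k-mer that consists only of A, C, G, T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # assumption: skip ambiguous windows
            counts[kmer] += 1
    return counts


def cg_class(kmer):
    """Number of CG dinucleotides in a k-mer (0 for 0CG, 1 for 1CG, ...)."""
    return sum(1 for i in range(len(kmer) - 1) if kmer[i:i + 2] == "CG")


def spectrum_by_cg(counts):
    """For each CG class, map usage frequency -> number of k-mers used that often.

    Only observed k-mers are included, so frequency 0 never appears.
    """
    spectra = {}
    for kmer, n in counts.items():
        spectra.setdefault(cg_class(kmer), Counter())[n] += 1
    return spectra


if __name__ == "__main__":
    toy_genome = "ACGTACGTACTTTTTTTTTTGGGGCCCC"  # made-up sequence
    for cls, spectrum in sorted(spectrum_by_cg(kmer_counts(toy_genome)).items()):
        print(f"{cls}CG:", dict(spectrum))
```

On a real genome each per-class spectrum would be plotted as a histogram; comparing genomes of different sizes would additionally require a normalization step such as the one the abstract mentions, which is not shown here.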

13.
Structural proteomics: a tool for genome annotation   Cited by: 1 (self-citations: 0; citations by others: 1)
In any newly sequenced genome, 30% to 50% of genes encode proteins with unknown molecular or cellular function. Fortunately, structural genomics is emerging as a powerful approach to functional annotation. Because of recent developments in high-throughput technologies, ongoing structural genomics projects are generating new structures at an unprecedented rate. In the past year, structural studies have identified many new structural motifs involved in enzymatic catalysis or in binding ligands or other macromolecules (DNA, RNA, protein). The efficiency with which function is deduced from structure can be further improved by the integration of structure with bioinformatics and other experimental approaches, such as screening for enzymatic activity or ligand binding.

14.
Ductal infiltrating carcinoma (DIC) of the breast is the most common and potentially aggressive form of cancer. Knowledge of proteomic profiles, attained both in vivo and in vitro, is fundamental to acquire as much information as possible on the proteins expressed in these pathologic conditions. We used the breast cancer cell line 8701-BC, established from a primary DIC, with the aim of contributing to the databases on mammary cancer cells, which in turn will be very useful for the identification of differentially expressed proteins in normal and neoplastic cells. Within an analysis window comprising about 1750 discernible spots, we have at present catalogued 84 protein spots. The proteins for which an identity was assigned were identified essentially using gel comparison, N-terminal (Nt) microsequencing and immune detection. Among the Nt-microsequenced protein spots, sixteen corresponded to known proteins, four were modified relative to matching sequences deposited in databases, and seven were unknown. These modified or novel sequences are thus of potential interest for breast cancer proteomics and its applications.

15.
16.
To understand and eventually predict the effects of changing redox conditions and oxidant levels on the physiology of an organism, it is essential to gain knowledge about its redoxome: the proteins whose activities are controlled by the oxidation status of their cysteine thiols. Here, we applied the quantitative redox proteomic method OxICAT to Saccharomyces cerevisiae and determined the in vivo thiol oxidation status of almost 300 different yeast proteins distributed among various cellular compartments. We found that a substantial number of cytosolic and mitochondrial proteins are partially oxidized during exponential growth. Our results suggest that prevailing redox conditions constantly control central cellular pathways by fine-tuning oxidation status and hence activity of these proteins. Treatment with sublethal H2O2 concentrations caused a subset of 41 proteins to undergo substantial thiol modifications, thereby affecting a variety of different cellular pathways, many of which are directly or indirectly involved in increasing oxidative stress resistance. Classification of the identified protein thiols according to their steady-state oxidation levels and sensitivity to peroxide treatment revealed that redox sensitivity of protein thiols does not predict peroxide sensitivity. Our studies provide experimental evidence that the ability of protein thiols to react to changing peroxide levels is likely governed by both thermodynamic and kinetic parameters, making predicting thiol modifications challenging and de novo identification of peroxide sensitive protein thiols indispensable.

17.
Hai ming Ni, Da wei Qi, Hongbo Mu. Genomics, 2018, 110(3): 180-190
Converting a DNA sequence to an image by using chaos game representation (CGR) is an effective genome sequence pretreatment technology, which provides the basis for further analysis between different genes. In this paper, we constructed gene CGR images for 10 mammal species, 48 hepatitis E viruses (HEV), and 10 kinds of bacteria, and calculated the mean structural similarity (MSSIM) coefficient between every two CGR images. From our analysis, the MSSIM coefficient of gene CGR images can accurately reflect the degree of similarity between different genomes. Hierarchical clustering analysis was used to calculate the class affiliation and construct a dendrogram. Large numbers of experiments showed that this method gives results comparable to the traditional Clustal X phylogenetic tree construction method, and is significantly faster in the clustering analysis process. Meanwhile, the combined MSSIM and CGR method was also able to efficiently cluster large genome sequences, which the traditional multiple sequence alignment methods (e.g. Clustal X, Clustal Omega, Clustal W, etc.) cannot classify.

18.
While genome sequencing efforts reveal the basic building blocks of life, a genome sequence alone is insufficient for elucidating biological function. Genome annotation, the process of identifying genes and assigning function to each gene in a genome sequence, provides the means to elucidate biological function from sequence. Current state-of-the-art high-throughput genome annotation uses a combination of comparative (sequence similarity data) and non-comparative (ab initio gene prediction algorithms) methods to identify protein-coding genes in genome sequences. Because approaches used to validate the presence of predicted protein-coding genes are typically based on expressed RNA sequences, they cannot independently and unequivocally determine whether a predicted protein-coding gene is translated into a protein. With the ability to directly measure peptides arising from expressed proteins, high-throughput liquid chromatography-tandem mass spectrometry-based proteomics approaches can be used to verify coding regions of a genomic sequence. Here, we highlight several ways in which high-throughput tandem mass spectrometry-based proteomics can improve the quality of genome annotations and suggest that it could be efficiently applied during the gene calling process so that the improvements are propagated through the subsequent functional annotation process.
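
The idea of using measured peptides to verify coding regions can be illustrated with a toy six-frame translation: translate the genomic sequence in all six reading frames and report which frames contain a given peptide. This is only a sketch under simplifying assumptions. It uses the standard genetic code, exact string matching on an already identified peptide sequence, and hypothetical function names; real proteogenomic workflows match MS/MS spectra against such translations with dedicated search software.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Reverse-complement a DNA string; characters other than ACGT are kept as-is."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frame(dna):
    """Return translations keyed by frame name: +1, +2, +3, -1, -2, -3."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def frames_supporting(peptide, dna):
    """List the reading frames whose translation contains the peptide."""
    return [name for name, protein in six_frame(dna).items() if peptide in protein]


if __name__ == "__main__":
    genomic = "ATGGCCAAATAA"  # made-up sequence encoding M-A-K-stop in frame +1
    print(frames_supporting("MAK", genomic))
```

A peptide found in a frame where no gene has been called would be a hint that the annotation is missing or mis-framing a coding region.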

19.
Using the transcriptome to annotate the genome   Cited by: 35 (self-citations: 0; citations by others: 35)
A remaining challenge for the human genome project involves the identification and annotation of expressed genes. The public and private sequencing efforts have identified approximately 15,000 sequences that meet stringent criteria for genes, such as correspondence with known genes from humans or other species, and have made another approximately 10,000-20,000 gene predictions of lower confidence, supported by various types of in silico evidence, including homology studies, domain searches, and ab initio gene predictions. These computational methods have limitations, both because they are unable to identify a significant fraction of genes and exons and because they are unable to provide definitive evidence about whether a hypothetical gene is actually expressed. As the in silico approaches identified a smaller number of genes than anticipated, we wondered whether high-throughput experimental analyses could be used to provide evidence for the expression of hypothetical genes and to reveal previously undiscovered genes. We describe here the development of such a method, called long serial analysis of gene expression (LongSAGE), an adaptation of the original SAGE approach, that can be used to rapidly identify novel genes and exons.

20.
In this study, an in silico approach was developed to identify homologies existing between livestock microsatellite flanking sequences and GenBank nucleotide sequences. Initially, 1955 bovine, 1570 porcine and 1121 chicken microsatellites were downloaded and the flanking sequences were compared with the nr and dbEST databases of GenBank. A total of 74 bovine, 44 porcine and 37 chicken microsatellite flanking sequences passed our criteria and had at least one significant match to human genomic sequence, genes/expressed sequence tags (ESTs) or both. GenBank annotation and BLAT searches of the UCSC human genome assembly revealed that 38 bovine, 13 porcine and 17 chicken microsatellite flanking sequences were highly similar to known human genes. Map locations were available for 67 bovine, 44 porcine and 21 chicken microsatellite flanking sequences, providing useful links in the comparative maps of humans and livestock. In support of our approach, 112 alignments with both microsatellite and match mapping information were located in the expected chromosomal regions based on previously reported syntenic relationships. The development of this in silico mapping approach has significantly increased the number of genes and EST sequences anchored to the bovine, porcine and chicken genome maps and the number of links in various human-livestock comparative maps.
