Similar literature
1.
A detailed knowledge of a protein's functional site is an absolute prerequisite for understanding its mode of action at the molecular level. However, the rapid pace at which sequence and structural information is being accumulated for proteins greatly exceeds our ability to determine their biochemical roles experimentally. As a result, computational methods are required which allow for the efficient processing of the evolutionary information contained in this wealth of data, in particular that related to the nature and location of functionally important sites and residues. The method presented here, referred to as conserved functional group (CFG) analysis, relies on a simplified representation of the chemical groups found in amino acid side-chains to identify functional sites from a single protein structure and a number of its sequence homologues. We show that CFG analysis can fully or partially predict the location of functional sites in approximately 96% of the 470 cases tested and that, unlike other methods available, it is able to tolerate wide variations in sequence identity. In addition, we discuss its potential in a structural genomics context, where automation, scalability and efficiency are critical, and an increasing number of protein structures are determined with no prior knowledge of function. This is exemplified by our analysis of the hypothetical protein Ydde_Ecoli, whose structure was recently solved by members of the North East Structural Genomics consortium. Although the proposed active site for this protein needs to be validated experimentally, this example illustrates the scope of CFG analysis as a general tool for the identification of residues likely to play an important role in a protein's biochemical function. Thus, our method offers a convenient solution to rapidly and automatically process the vast amounts of data that are beginning to emerge from structural genomics projects.

2.
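In this study, the transcriptome of larvae of the Altai ghost moth, Hepialus altaicola Wang, was sequenced on the Illumina HiSeq™ 2500 platform and analyzed bioinformatically. Sequence assembly yielded 100,133 unigenes with a total length of 86,319,112 bp, a mean length of 862 bp and an N50 of 1,628 bp. Comparison of the unigenes against the NR, COG/KOG, Pfam, Swiss-Prot, GO and KEGG databases annotated 38,198 unigenes; the NR database annotated the most, 32,381 unigenes (32.34%). GO functional classification assigned 13,216 unigenes to 57 branches within the three main categories of cellular component, molecular function and biological process. KEGG pathway analysis annotated 15,058 unigenes, which mapped to 305 metabolic pathways. CDS prediction found that 54,002 sequences were coding, accounting for 53.93% of all genes. Gene annotation further identified 311 metabolic regulatory genes related to cold adaptation, and gene expression was estimated with FPKM values. The transcriptome data and analyses obtained in this study provide a molecular foundation for further research on gene function and low-temperature ecological adaptation in H. altaicola.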
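The two summary statistics named in this abstract, N50 and FPKM, follow standard definitions that can be sketched briefly. The snippet below is only an illustration under those textbook definitions; the function names and the toy inputs are hypothetical, and nothing here reproduces the authors' actual pipeline or data.

```python
def n50(lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


if __name__ == "__main__":
    # Hypothetical toy values, not taken from the study.
    toy_lengths = [2, 3, 4, 5, 6]
    print("N50:", n50(toy_lengths))
    print("FPKM:", fpkm(100, 2000, 1_000_000))
```

Because N50 is weighted by length, it is typically larger than the mean length when a few long sequences dominate the assembly, which is consistent with the abstract reporting a mean of 862 bp alongside an N50 of 1,628 bp.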

3.
4.
Oryza sativa (rice) plays an essential food security role for more than half of the world’s population. Obtaining crops with high levels of disease resistance is a major challenge for breeders, especially today, given the urgent need for agriculture to be more sustainable. Plant resistance genes are mainly encoded by three large leucine-rich repeat (LRR)-containing receptor (LRR-CR) families: the LRR-receptor-like kinase (LRR-RLK), LRR-receptor-like protein (LRR-RLP) and nucleotide-binding LRR receptor (NLR). Using lrrprofiler, a pipeline that we developed to annotate and classify these proteins, we compared three publicly available annotations of the rice Nipponbare reference genome. The extensive discrepancies that we observed for LRR-CR gene models led us to perform an in-depth manual curation of their annotations while paying special attention to nonsense mutations. We then transferred this manually curated annotation to Kitaake, a cultivar that is closely related to Nipponbare, using an optimized strategy. Here, we discuss the breakthrough achieved by manual curation when comparing genomes and, in addition to ‘functional’ and ‘structural’ annotations, we propose that the community adopt this approach, which we call ‘comprehensive’ annotation. The resulting data are crucial for further studies on the natural variability and evolution of LRR-CR genes in order to promote their use in breeding future resilient varieties.

5.
The complete nucleotide sequence (501,020 bp) of the mitochondrial genome from cytoplasmic male-sterile (CMS) sugar beet was determined. This enabled us to compare the sequence with that previously published for the mitochondrial genome of normal, male-fertile sugar beet. The comparison revealed that the two genomes have the same complement of genes of known function. The rRNA and tRNA genes encoded in the CMS mitochondrial genome share 100% sequence identity with their respective counterparts in the normal genome. We found a total of 24 single nucleotide substitutions in 11 protein genes encoded by the CMS mitochondrial genome. However, none of these seems to be responsible for male sterility. In addition, several other ORFs were found to be actively transcribed in sugar beet mitochondria. Among these, Norf246 was observed to be present in the normal mitochondrial genome but absent from the CMS genome. However, it seems unlikely that the loss of Norf246 is causally related to the expression of CMS, because previous studies on mitochondrial translation products failed to detect the product of this ORF. Conversely, the CMS genome contains four transcribed ORFs (Satp6presequence, Scox2-2, Sorf324 and Sorf119) which are missing from the normal genome. These ORFs, which are potential candidates for CMS genes, were shown to be generated by mitochondrial genome rearrangements. Electronic Supplementary Material: Supplementary material is available in the online version of this article. Communicated by R. Hagemann.

6.
Anaplastic thyroid cancer (ATC) has a high degree of malignancy and poor prognosis. The purpose of this study was to determine differentially expressed genes (DEGs) in ATC through biometric analysis technology, clarify potential interactions between them, and screen genes related to the prognosis of ATC. Using the DEGs obtained, Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), protein-protein interaction (PPI), and survival analyses were performed. After integrated analysis of the four datasets in R, 764 DEGs were obtained, i.e., 314 upregulated genes and 450 downregulated genes. Among the hub DEGs obtained from the PPI network, the expression levels of TYMS, FN1, CHRDL1, SDC2, ITGA2, COL1A1, COL9A3, and COL23A1 were associated with ATC prognosis. In the KM plotter, recurrence-free survival (RFS) in ATC was significantly associated with TYMS, FN1, ITGA2, COL23A1, SDC2, and CHRDL1 (P<0.05). Thus, the study suggests that TYMS, FN1, ITGA2, COL23A1, SDC2, and CHRDL1 may be used as potential biomarkers of ATC. These findings provide new insights for the detection of novel diagnostic and therapeutic biomarkers for ATC.

7.
8.
Cancer cells reprogram their metabolism to support growth and invasion. While previous work has highlighted how single altered reactions and pathways can drive tumorigenesis, it remains unclear how individual changes propagate at the network level and eventually determine global metabolic activity. To characterize the metabolic lifestyle of cancer cells across pathways and genotypes, we profiled the intracellular metabolome of 180 pan‐cancer cell lines grown in identical conditions. For each cell line, we estimated activity for 49 pathways spanning the entirety of the metabolic network. Upon clustering, we discovered a convergence into only two major metabolic types. These were functionally confirmed by 13C‐flux analysis, lipidomics, and analysis of sensitivity to perturbations. They revealed that the major differences in cancers are associated with lipid, TCA cycle, and carbohydrate metabolism. Thorough integration of these types with multiomics highlighted little association with genetic alterations but a strong association with markers of epithelial–mesenchymal transition. Our analysis indicates that in the absence of variations imposed by the microenvironment, cancer cells adopt distinct metabolic programs which serve as vulnerabilities for therapy.

9.
10.
X Wu, X Li, L Li, X Xu, J Xia, Z Yu. Gene, 2012, 507(2):112-118
A feasible way to perform evolutionary analyses is to compare characters divergent enough to observe significant differences, but sufficiently similar to exclude saturation of the differences that occurred. Thus, comparisons of invertebrate mitochondrial (mt) genomes at low taxonomic levels can be extremely helpful in investigating patterns of variation and evolutionary dynamics of genomes, as intermediate stages of the process may be identified. In this study, we newly sequenced the mt genome of the eighth member of the Asian Crassostrea oysters, which provides the intermediate characters needed to show that the variation of Crassostrea mt genomes is considerably greater than previously acknowledged. Several new features of Asian Crassostrea oyster mitochondrial genomes were revealed, and our results are particularly significant as they 1) suggest a novel model of alloacceptor tRNA gene recruitment, namely "vertical" tRNA gene recruitment, which can be successfully used to explain the origin of the unusual additional trnK and trnQ genes (annotated as trnK(2) and trnQ(2) respectively) in the mt genomes of the five Asian oysters, and we speculate that this recruitment process may be a common phenomenon in the evolution of the tRNA multigene family; 2) reveal the existence of two additional, lineage-specific, mtDNA-encoded genes that may originate from duplication of nad2 followed by rapid evolutionary change. Each of these two genes encodes a unique amino terminal signal peptide, thus each might possess an unknown function; and 3) identify for the first time the atp8 gene in oysters. The present study thus gives further credence to the comparison of congeneric bivalves as a meaningful strategy to investigate mt genomic evolutionary trends in genome organization, tRNA multigene family, and gene loss and/or duplication that are difficult to undertake at higher taxonomic levels. In particular, our study provides new evidence for the identification and characterization of ORFs in the "non-coding region" of animal mt genomes.

11.
A computational analysis of RNA editing sites was performed on protein-coding sequences of plant mitochondrial genomes from Arabidopsis thaliana, Beta vulgaris, Brassica napus, and Oryza sativa. The distribution of nucleotides around edited and unedited cytidines was compared in 41 nucleotide segments and included 1481 edited cytidines and 21,390 unedited cytidines in the 4 genomes. The distribution of nucleotides was examined in 1, 2, and 3 nucleotide windows by comparison of nucleotide frequency ratios and relative entropy. The relative entropy analyses indicate that information is encoded in the nucleotide sequences in the 5 prime flank (–18 to –14, –13 to –10, –6 to –4, –2/–1) and the immediate 3 prime flanking nucleotide (+1), and these regions may be important in editing site recognition. The relative entropy was large when 2 or 3 nucleotide windows were analyzed, suggesting that several contiguous nucleotides may be involved in editing site recognition. RNA editing sites were frequently preceded by 2 pyrimidines or AU and followed by a guanosine (HYCG) in the monocot and dicot mitochondrial genomes, and rarely preceded by 2 purines. Analysis of chloroplast editing sites from a dicot, Nicotiana tabacum, and a monocot, Zea mays, revealed a similar distribution of nucleotides around editing sites (HYCA). The similarity of this motif around editing sites in monocots and dicots in both mitochondria and chloroplasts suggests that a mechanistic basis for this motif exists that is common in these different organelle and phylogenetic systems. The preferred sequence distribution around RNA editing sites may have an important impact on the acquisition of editing sites in evolution because the immediate sequence context of a cytidine residue may render a cytidine editable or uneditable, and consequently determine whether a T to C mutation at a specific position may be corrected by RNA editing. The distribution of editing sites in many protein-coding sequences is shown to be non-random with editing sites clustered in groups separated by regions with no editing sites. The sporadic distribution of editing sites could result from a mechanism of editing site loss by gene conversion utilizing edited sequence information, possibly through an edited cDNA intermediate.
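The relative entropy comparison mentioned in this abstract can be illustrated with a short sketch. The abstract does not give the exact formula or windowing used, so the code below assumes the usual Kullback-Leibler divergence between the nucleotide distribution at one flanking position of edited cytidines and the distribution at the same position of unedited cytidines; the function name, the pseudocount, and the toy inputs are all hypothetical.

```python
import math
from collections import Counter


def relative_entropy(edited_bases, unedited_bases, alphabet="ACGU", pseudocount=1.0):
    """Kullback-Leibler divergence D(P_edited || P_unedited) in bits.

    Each argument is a sequence of the bases observed at one fixed
    position relative to the cytidine (for example, position -1).
    A pseudocount keeps probabilities non-zero (an assumption made
    here for numerical safety, not something stated in the abstract).
    """
    p_counts = Counter(edited_bases)
    q_counts = Counter(unedited_bases)
    # Only bases in the alphabet contribute to the totals.
    p_total = sum(p_counts[b] for b in alphabet) + pseudocount * len(alphabet)
    q_total = sum(q_counts[b] for b in alphabet) + pseudocount * len(alphabet)
    score = 0.0
    for base in alphabet:
        p = (p_counts[base] + pseudocount) / p_total
        q = (q_counts[base] + pseudocount) / q_total
        score += p * math.log2(p / q)
    return score


if __name__ == "__main__":
    # Hypothetical toy columns, not data from the study.
    print(relative_entropy("UUCUUCUU", "ACGUACGU"))
```

A value near zero means the position looks the same around edited and unedited cytidines, while larger values flag positions whose composition differs and that may therefore carry recognition information. Extending the alphabet to dinucleotides or trinucleotides would correspond to the 2 and 3 nucleotide windows described above.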

12.
Zinc metalloproteins are involved in many biological processes and play crucial biochemical roles across all domains of life. Local structure around the zinc ion, especially the coordination geometry (CG), is dictated by the protein sequence and is often directly related to the function of the protein. Current methodologies for characterizing the CG of zinc metalloproteins consider only previously reported CG models based mainly on nonbiological chemical context. Exceptions to these canonical CG models are either misclassified or discarded as “outliers.” Thus, we developed a less‐biased method that directly handles potential exceptions without pre‐assuming any CG model. Our study shows that numerous exceptions could actually be further classified and that new CG models are needed to characterize them. Also, these new CG models are cross‐validated by strong correlation between independent structural and functional annotation distance metrics, which is partially lost if these new CG models are ignored. Furthermore, these new CG models exhibit functional propensities distinct from the canonical CG models. Proteins 2015; 83:1470–1487. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.

13.
Peptidases occur naturally in all organisms and their genes comprise 1–5% of the total number of genes. Genetic, biochemical, and molecular approaches used in recent years led to the identification and characterization of several plant organelle proteases, all of them being homologous to bacterial proteases best characterized in Escherichia coli. Here we report isolating and characterizing three novel genes, namely Sszn-mp1, Sszn-mp2, and SsZn-mp3 from Solanum surattense. To identify the subcellular location, structures, and functions of these three genes, integrated genomic approaches of data mining, expression profiling, and bioinformatic predictions were used. Sszn-mp is found to be constitutively expressed in tissues and regulated by various stimuli. Analysis of eight zinc-metalloproteases (Zn-MPs) deduced or assembled from Arabidopsis thaliana, tomato, potato, cotton, barley, sugarcane, and rice and four Zn-MPs from cyanobacteria (blue-green algae) in the GenBank database reveals that these proteins belong to a novel conserved membrane zinc-metalloprotease family. The plant Zn-MP members share more than 62% overall identity with SsZn-MP3, whereas four putative ATP-dependent zinc-proteases of cyanobacteria have low identity with SsZn-MP3 and their N-termini are about 110 amino acids shorter than those of plant Zn-MPs. The Zn-MP homologous sequences are found in neither other eukaryotic nor prokaryotic databases, suggesting that this family is specific to plants and cyanobacteria. The plant Zn-MP genes encoding membrane proteins are potentially targeted to chloroplast and plasma membranes, and the bacterial Zn-MPs are targeted to the cytoplasmic membrane, and their N-terminal targeting peptides are cleaved off for targeting the mature proteins to their subcellular compartments. The Zn-MP proteins contain a conserved zinc-binding site (HEAGHX19E/DX46∼48EX7E), a potential G-protein coupled receptors family 1 signature, and a triplet motif (N-R/K-F) in plant Zn-MPs, a D/E-R-Y motif in the four bacterial Zn-MPs, suggesting that the different mature forms of Zn-MPs may function as proteases and/or signal receptors. Published in Russian in Fiziologiya Rastenii, 2007, Vol. 54, No. 1, pp. 73–84. The text was submitted by the authors in English.

14.
The genus Caulobacter is found in a variety of habitats and is known for its ability to thrive in low-nutrient conditions. K31 is a novel Caulobacter isolate that has the ability to tolerate copper and chlorophenols, and can grow at 4°C with a doubling time of 40 h. K31 contains a 5.5 Mb chromosome that codes for more than 5500 proteins and two large plasmids (234 and 178 kb) that code for 438 additional proteins. A comparison of the K31 and the Caulobacter crescentus NA1000 genomes revealed extensive rearrangements of gene order, suggesting that the genomes had been randomly scrambled. However, a careful analysis revealed that the distance from the origin of replication was conserved for the majority of the genes and that many of the rearrangements involved inversions that included the origin of replication. On a finer scale, numerous small indels were observed. K31 proteins involved in essential functions shared 80–95% amino acid sequence identity with their C. crescentus homologues, while other homologue pairs tended to have lower levels of identity. In addition, the K31 chromosome contains more than 1600 genes with no homologue in NA1000.

15.
A large set of candidate nucleotide-binding site (NBS)-encoding genes related to disease resistance was identified in the sorghum (Sorghum bicolor) genome. These resistance (R) genes were characterized based on their structural diversity, physical chromosomal location and phylogenetic relationships. Based on their N-terminal motifs and leucine-rich repeats (LRR), 50 non-regular NBS genes and 224 regular NBS genes were identified among the 274 candidate NBS genes. The regular NBS genes were classified into ten types: CNL, CN, CNLX, CNX, CNXL, CXN, NX, N, NL and NLX. The vast majority (97%) of NBS genes occurred in gene clusters, indicating extensive gene duplication in the evolution of S. bicolor NBS genes. Analysis of the S. bicolor NBS phylogenetic tree revealed two major clades. Most NBS genes were located at the distal tip of the long arms of the ten sorghum chromosomes, a pattern significantly different from rice and Arabidopsis, the NBS genes of which have a random chromosomal distribution.

16.
17.
18.
19.
The Sakishima islands are members of the Ryukyu island chain, stretching from the southwestern tip of the Japanese archipelago to Taiwan in the East China Sea. Archaeological data indicate cultural similarities between inhabitants of prehistoric Sakishima and Neolithic Taiwan. Recent studies based on tooth crown traits show remarkably high inter‐island diversity among Ryukyu islanders, suggesting that the Sakishima islanders might have genetic backgrounds distinct from main‐island Okinawa people. To investigate the genetic diversity of the Ryukyu islanders, we analyzed mtDNA, Y chromosome, and autosomal short tandem repeat loci in a sample of main‐island Okinawa people and Sakishima (Miyako and Ishigaki) islanders who participated in a previous study of tooth crown morphology. Our phylogenetic analysis of maternal (mtDNA) and paternal (Y chromosome) lineages shows that the Sakishima islanders are more closely related to people from the Japanese archipelago than to Taiwan aborigines. Miyako islanders and the Hokkaido Ainu have the first and second highest frequencies (respectively) of the Y‐chromosomal Alu‐insertion polymorphism, which is a presumable Jomon marker. Genetic diversity statistics show no evidence of demographic reduction or of extreme isolation in each island's population. Thus, we conclude that 1) Neolithic expansion from Taiwan did not contribute to the gene pool of modern Sakishima islanders, 2) the male lineage of the Ryukyu islanders likely shares a common ancestor with the Hokkaido Ainu, who are presumably direct descendants of the Jomon people, and 3) frequent reciprocal gene flow among islands has masked the trace of common ancestry in the Ryukyu island chain. Am J Phys Anthropol, 2010. © 2010 Wiley‐Liss, Inc.

20.