Similar literature
20 similar documents found.
1.
Based on searches for disabled homologs to known proteins, we have identified a large population of pseudogenes in four sequenced eukaryotic genomes: the worm, yeast, fly and human (chromosomes 21 and 22 only). Each of our nearly 2500 pseudogenes is characterized by one or more disablements mid-domain, such as premature stops and frameshifts. Here, we perform a comprehensive survey of the amino acid and nucleotide composition of these pseudogenes in comparison to that of functional genes and intergenic DNA. We show that pseudogenes invariably have an amino acid composition intermediate between genes and translated intergenic DNA. Although the degree of intermediacy varies among the four organisms, in all cases, it is most evident for amino acid types that differ most in occurrence between genes and intergenic regions. The same intermediacy also applies to codon frequencies, especially in the worm and human. Moreover, the intermediate composition of pseudogenes applies even though the composition of the genes in the four organisms is markedly different, showing a strong correlation with the overall A/T content of the genomic sequence. Pseudogenes can be divided into ‘ancient’ and ‘modern’ subsets, based on the level of sequence identity with their closest matching homolog (within the same genome). Modern pseudogenes usually have a much closer sequence composition to genes than ancient pseudogenes. Collectively, our results indicate that the composition of pseudogenes that are under no selective constraints progressively drifts from that of coding DNA towards non-coding DNA. Therefore, we propose that the degree to which pseudogenes approach a random sequence composition may be useful in dating different sets of pseudogenes, as well as in assessing the rate at which intergenic DNA accumulates mutations. Our compositional analyses with the interactive viewer are available over the web at http://genecensus.org/pseudogene.

2.
3.
Your Gene structure Annotation Tool for Eukaryotes (yrGATE) provides an Annotation Tool and Community Utilities for worldwide web-based community genome and gene annotation. Annotators can evaluate gene structure evidence derived from multiple sources to create gene structure annotations. Administrators regulate the acceptance of annotations into published gene sets. yrGATE is designed to facilitate rapid and accurate annotation of emerging genomes as well as to confirm, refine, or correct currently published annotations. yrGATE is highly portable and supports different standard input and output formats. The yrGATE software and usage cases are available at .

4.
Transcriptional repression of eukaryotic promoters   Cited by: 108 (self-citations: 0; citations by others: 108)
M Levine, J L Manley. Cell, 1989, 59(3): 405-408

5.
Reversible protein phosphorylation by protein kinases and phosphatases is a ubiquitous signaling mechanism in all eukaryotic cells. A multilevel hidden Markov model library is presented which is able to classify protein kinases into one of 12 families, with a misclassification rate of zero on the characterized kinomes of H. sapiens, M. musculus, D. melanogaster, C. elegans, S. cerevisiae, D. discoideum, and P. falciparum. The Library is shown to outperform BLASTP and a general Pfam hidden Markov model of the kinase catalytic domain in the retrieval and family-level classification of protein kinases. The application of the Library to the 38 unclassified kinases of yeast enriches the yeast kinome in protein kinases of the families AGC (5), CAMK (17), CMGC (4), and STE (1), thereby raising the family-level classification of yeast conventional protein kinases from 66.96 to 90.43%. The application of the Library to 21 eukaryotic genomes shows seven families (AGC, CAMK, CK1, CMGC, STE, PIKK, and RIO) to be present in all genomes analyzed, and so these are likely to be essential to eukaryotes. Putative tyrosine kinases (TKs) are found in the plants A. thaliana (2), O. sativa ssp. Indica (6), and O. sativa ssp. Japonica (7), and in the amoeba E. histolytica (7). To our knowledge, TKs have not been predicted in plants before. This also suggests that a primitive set of TKs might have predated the radiation of eukaryotes. Putative tyrosine kinase-like kinases (TKLs) are found in the fungi C. neoformans (2) and P. chrysosporium (4), in the Apicomplexans C. hominis (4), P. yoelii (4), and P. falciparum (6), in the amoeba E. histolytica (109), and in the alga T. pseudonana (6). TKLs are found to be abundant in plants (776 in A. thaliana, 1010 in O. sativa ssp. Indica, and 969 in O. sativa ssp. Japonica). TKLs might also have predated the radiation of eukaryotes and have been lost secondarily from some fungi.
The application of the Library facilitates the annotation of kinomes and has provided novel insights into the early evolution and subsequent adaptations of the various protein kinase families in eukaryotes.

6.
7.
Tetracycline-reversible silencing of eukaryotic promoters.   Cited by: 12 (self-citations: 1; citations by others: 11)

8.
9.
The falling cost of genome sequencing is having a marked impact on the research community with respect to which genomes are sequenced and how and where they are annotated. Genome annotation projects have generally become small-scale affairs that are often carried out by an individual laboratory. Although annotating a eukaryotic genome assembly is now within the reach of non-experts, it remains a challenging task. Here we provide an overview of the genome annotation process and the available tools and describe some best-practice approaches.

10.
The annotation of protein function at genomic scale is essential for day-to-day work in biology and for any systematic approach to the modeling of biological systems. Currently, functional annotation is essentially based on the expansion of the relatively small number of experimentally determined functions to large collections of proteins. The task of systematic annotation faces formidable practical problems related to the accuracy of the input experimental information, the reliability of current systems for transferring information between related sequences, and the reproducibility of the links between database information and the original experiments reported in publications. These technical difficulties merely lie on the surface of the deeper problem of the evolution of protein function in the context of protein sequences and structures. Given the mixture of technical and scientific challenges, it is not surprising that errors are introduced, and expanded, in database annotations. In this situation, a more realistic option is the development of a reliability index for database annotations, instead of depending exclusively on efforts to correct databases. Several groups have attempted to compare the database annotations of similar proteins, which constitutes a first step toward the calibration of the relationship between sequence and annotation space.

11.
Models for prediction and recognition of eukaryotic promoters   Cited by: 13 (self-citations: 0; citations by others: 13)

12.
Gene, 1998, 212(2): 259-268
Mammalian pancreatic ribonucleases (RNase) form a family of extensively studied homologous proteins. Phylogenetic analyses, based on the primary structures of these enzymes, indicated that the presence of three homologous enzymes (pancreatic, seminal and brain ribonucleases) in the bovine species is due to gene duplication events, which occurred during the evolution of ancestral ruminants. In this paper, we report the sequences of the coding regions of the orthologues of the three bovine secretory ribonucleases in hog deer and roe deer, two deer species belonging to two different subfamilies of the family Cervidae. The sequences of the 3′ untranslated regions of the three different secretory RNase genes of these two deer species and giraffe are also presented. Comparison of these and previously determined sequences of ruminant ribonucleases showed that the brain-type enzymes of giraffe and these deer species exhibit variations in their C-terminal extensions. The seminal-type genes of giraffe, hog deer and roe deer show all the features of pseudogenes. Phylogenetic analyses, based on the complete coding regions and parts of the 3′ untranslated regions of the three different secretory ribonuclease genes of ox, sheep, giraffe and the two deer species, show that pancreatic, seminal- and brain-type RNases form three separate groups.

13.
CAT vectors for analysis of eukaryotic promoters and enhancers   Cited by: 36 (self-citations: 0; citations by others: 36)
E Prost, D D Moore. Gene, 1986, 45(1): 107-111
We have constructed two sets of plasmids for analysis of factors affecting mammalian gene expression. The pOCAT series contains a bacterial chloramphenicol-resistance expression unit (cat) and no eukaryotic promoter. The pUTKAT series contains the same cat unit under the control of the thymidine-kinase promoter of Herpes simplex virus. These plasmids are designed for testing effects of inserted regulatory elements on cat expression after transient transfection of mammalian cells in culture. We demonstrate here that the pOCAT series is useful for studying activities of inserted eukaryotic promoters, and the pUTKAT series is useful for studying activities of inserted eukaryotic enhancers.

14.
Structural relationship of human interferon alpha genes and pseudogenes   Cited by: 17 (self-citations: 0; citations by others: 17)
We have isolated and characterized DNA segments containing IFN-alpha-related sequences from human lambda and cosmid clone banks. We describe six linkage groups comprising 18 distinct IFN-alpha-related loci, and report the nucleotide sequences of nine chromosomal IFN-alpha genes with intact reading frames, as well as of five pseudogenes. Taking into account as yet unsequenced genes as well as clones described by others, there are now seven linkage groups and 23 loci, of which 15 correspond to potentially functional genes and six to non-functional genes; two loci remain unsequenced. Eighteen additional sequences are likely to be allelic to the above. The finding that at least two IFN-alpha genes appear to be natural hybrids of other IFN-alpha genes, and that two distinct IFN-alpha loci have completely identical coding sequences, although their flanking regions are different, is evidence for information exchange between the individual genes.

15.
Patterns of nucleotide substitution in pseudogenes and functional genes   Cited by: 26 (self-citations: 0; citations by others: 26)
The pattern of point mutations is inferred from nucleotide substitutions in pseudogenes. The pattern obtained suggests that transition mutations occur somewhat more frequently than transversion mutations and that mutations result more often in A or T than in G or C. Our results are discussed with respect to the predictions from Topal and Fresco's model for the molecular basis of point (substitution) mutations (Nature 263:285–289, 1976). The pattern of nucleotide substitution at the first and second positions of codons in functional genes is quite similar to that in pseudogenes, but the relative frequency of the transition C→T in the sense strand is drastically reduced and those of the transversions C→G and G→C are doubled. The differences between the two patterns can be explained by the observation that in protein evolution, amino acid substitutions occur mainly between amino acids with similar biochemical properties (Grantham, Science 185:862–864, 1974). Our results for the patterns of nucleotide substitutions in pseudogenes and in functional genes lead to the prediction that both the coding and non-coding regions of protein-coding genes should have high frequencies of A and T. Available data show that the non-coding regions are indeed high in A and T, but the coding regions are low in T, though high in A.

16.
The GCN5-related N-acetyltransferase (GNAT) superfamily comprises more than 870 000 enzymes across all kingdoms of life, all sharing the same structural fold. GNAT enzymes transfer an acyl moiety from acyl coenzyme A to a wide range of substrates including aminoglycosides, serotonin, glucosamine-6-phosphate, protein N-termini and lysine residues of histones and other proteins. The GNAT subtype of protein N-terminal acetyltransferases (NATs) alone targets a majority of all eukaryotic proteins, stressing the omnipresence of the GNAT enzymes. Despite the highly conserved GNAT fold, sequence similarity is quite low between members of this superfamily even when substrates are similar. Furthermore, this superfamily is phylogenetically not well characterized. Thus, functional annotation based on sequence similarity is unreliable and strongly hampered for thousands of GNAT members that remain biochemically uncharacterized. Here we used sequence similarity networks to map the sequence space and propose a new classification for eukaryotic GNAT acetyltransferases. Using the new classification, we built a phylogenetic tree representing the entire GNAT acetyltransferase superfamily. Our results show that protein NATs have evolved more than once on the GNAT acetylation scaffold. We use our classification to predict the function of uncharacterized sequences and verify by in vitro protein assays that two fungal genes encode NAT enzymes targeting specific protein N-terminal sequences, showing that even slight changes on the GNAT fold can lead to a change in substrate specificity. In addition to providing a new map of the relationship between eukaryotic acetyltransferases, the proposed classification constitutes a tool to improve functional annotation of GNAT acetyltransferases.

17.
C Gao, M Xiao, X Ren, A Hayward, J Yin, L Wu, D Fu, J Li. Genomics, 2012, 100(4): 222-230
The movement of transposable elements (TEs) in eukaryotic genomes can often result in the occurrence of nested TEs (the insertion of TEs into pre-existing TEs). We performed a general TE assessment using available databases to detect nested TEs and analyze their characteristics and putative functions in eukaryote genomes. A total of 802 TEs were found to be inserted into 690 host TEs from a total number of 11,329 TEs. We reveal that repetitive sequences are associated with an increased occurrence of nested TEs and with sequence bias of TE insertion. A high proportion of the genes associated with nested TEs are predicted to localize to organelles and participate in nucleic acid and protein binding. Many of these function in metabolic processes and encode important enzymes for transposition and integration. Therefore, nested TEs in eukaryotic genomes may negatively influence genome expansion, and enrich the diversity of gene expression or regulation.

18.
19.
20.
Automatic annotation of organellar genomes with DOGMA   Cited by: 17 (self-citations: 0; citations by others: 17)
The Dual Organellar GenoMe Annotator (DOGMA) automates the annotation of organellar (plant chloroplast and animal mitochondrial) genomes. It is a Web-based package that identifies and annotates genes using BLAST searches against a custom database and the conservation of basepairing in the secondary structure of animal mitochondrial tRNAs. DOGMA provides a graphical user interface for viewing and editing annotations. Annotations are stored on our password-protected server to enable repeated sessions of working on the same genome. Finished annotations can be extracted for direct submission to GenBank.
