Similar articles
 20 similar articles found
1.
The Genome Annotation Assessment Project tested current methods of gene identification, including a critical assessment of the accuracy of different methods. Two new databases have provided new resources for gene annotation: these are the InterPro database of protein domains and motifs, and the Gene Ontology database for terms that describe the molecular functions and biological roles of gene products. Efforts in genome annotation are most often based upon advances in computer systems that are specifically designed to deal with the tremendous amounts of data being generated by current sequencing projects. These efforts in analysis are being linked to new ways of visualizing computationally annotated genomes.

2.
Several eukaryotic genomes have been completely sequenced and this provides an opportunity to investigate the extent and characteristics (e.g., single gene duplication, block duplication, etc.) of gene duplication in a genome. Detecting duplicate genes in a genome, however, is not a simple problem because of several complications such as domain shuffling, the existence of isoforms derived from alternative splicing, and annotational errors in the databases. We describe a method for overcoming these difficulties and the extents of gene duplication in the genomes of Drosophila melanogaster, Caenorhabditis elegans, and yeast inferred from this method. We also describe a method for detecting block duplications in a genome. Application of this method showed that block duplication is a common phenomenon in both yeast and nematode. The patterns of block duplication in the two species are, however, markedly different. Yeast shows much more extensive block duplication than nematode, with some chromosomes having more than 40% of the duplications derived from block duplications. Moreover, in yeast the majority of block duplications occurred between chromosomes, while in nematode most block duplications occurred within chromosomes.

3.

Background  

Gene duplication and gene loss during the evolution of eukaryotes have hindered attempts to estimate phylogenies and divergence times of species. Although current methods that identify clusters of orthologous genes in complete genomes have helped to investigate gene function and gene content, they have not been optimized for evolutionary sequence analyses requiring strict orthology and complete gene matrices. Here we adopt a relatively simple and fast genome comparison approach designed to assemble orthologs for evolutionary analysis. Our approach identifies single-copy genes representing only species divergences (panorthologs) in order to minimize potential errors caused by gene duplication. We apply this approach to complete sets of proteins from published eukaryote genomes specifically for phylogeny and time estimation.
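The single-copy filter described in this abstract can be illustrated with a minimal sketch. It assumes gene families have already been clustered by some upstream step; the function name `single_copy_families`, the input layout, and the toy gene identifiers are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def single_copy_families(
    families: Dict[str, Dict[str, List[str]]],
    species: List[str],
) -> Dict[str, Dict[str, str]]:
    """Keep only families with exactly one gene in every listed species.

    `families` maps a family id to {species name: [gene ids]}.
    Families with a duplicated or missing gene in any species are dropped,
    which yields a complete matrix of one gene per species per family.
    """
    kept = {}
    for fam_id, members in families.items():
        if all(len(members.get(sp, [])) == 1 for sp in species):
            kept[fam_id] = {sp: members[sp][0] for sp in species}
    return kept


# Toy input (hypothetical identifiers).
families = {
    "fam1": {"human": ["h1"], "fly": ["f1"], "yeast": ["y1"]},
    "fam2": {"human": ["h2", "h3"], "fly": ["f2"], "yeast": ["y2"]},  # duplicated in human
    "fam3": {"human": ["h4"], "fly": ["f3"]},  # absent from yeast
}
result = single_copy_families(families, ["human", "fly", "yeast"])
```

Only `fam1` survives the filter here, since the other two families contain either a duplication or a loss.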

4.
5.
6.
An extreme level of DNA sequence polymorphism, the basis of DNA fingerprinting, was first demonstrated using genome-derived cloned probes. Subsequently, it was shown that DNA fingerprinting can also be carried out using short synthetic oligodeoxyribonucleotide probes specific for simple repetitive sequences. Further, in addition to radioactively labeled probes, non-radioactive oligonucleotides generate equally informative hybridization patterns. We discuss developments in the area of DNA fingerprinting and its future scope with respect to plant, animal and human DNA.

7.
Bornaviruses are the only animal RNA viruses that establish a persistent infection in their host cell nucleus. Studies of bornaviruses have provided unique information about viral replication strategies and virus–host interactions. Although bornaviruses do not integrate into the host genome during their replication cycle, we and others have recently reported that there are DNA sequences derived from the mRNAs of ancient bornaviruses in the genomes of vertebrates, including humans, and these have been designated endogenous borna-like (EBL) elements. Therefore, bornaviruses have been interacting with their hosts as driving forces in the evolution of host genomes in a previously unexpected way. Studies of EBL elements have provided new models for virology, evolutionary biology and general cell biology. In this review, we summarize the data on EBL elements, including those we have newly identified in eukaryotic genomes, and discuss the biological significance of EBL elements, with a focus on EBL nucleoprotein elements in mammalian genomes. Surprisingly, EBL elements were detected in the genomes of invertebrates, suggesting that the host range of bornaviruses may be much wider than previously thought. We also review our new data on non-retroviral integration of Borna disease virus.

8.

Background

Duplications of stretches of the genome are an important source of individual genetic variation, but their unrecognized presence in laboratory organisms would be a confounding variable for genetic analysis.

Results

We report here that duplications of 15 kb or more are common in the genome of the social amoeba Dictyostelium discoideum. Most stocks of the axenic 'workhorse' strains Ax2 and Ax3/4 obtained from different laboratories can be expected to carry different duplications. The auxotrophic strains DH1 and JH10 also bear previously unreported duplications. Strain Ax3/4 is known to carry a large duplication on chromosome 2 and this structure shows evidence of continuing instability; we find a further variable duplication on chromosome 5. These duplications are lacking in Ax2, which has instead a small duplication on chromosome 1. Stocks of the type isolate NC4 are similarly variable, though we have identified some approximating the assumed ancestral genotype. More recent wild-type isolates are almost without large duplications, but we can identify small deletions or regions of high divergence, possibly reflecting responses to local selective pressures. Duplications are scattered through most of the genome, and can be stable enough to reconstruct genealogies spanning decades of the history of the NC4 lineage. The expression level of many duplicated genes is increased with dosage, but for others it appears that some form of dosage compensation occurs.

Conclusion

The genetic variation described here must underlie some of the phenotypic variation observed between strains from different laboratories. We suggest courses of action to alleviate the problem.

9.
10.
Understanding regulatory mechanisms of protein synthesis in eukaryotes is essential for the accurate annotation of genome sequences. Kozak reported that the nucleotide sequence GCCGCC(A/G)CCAUGG (AUG is the initiation codon) was frequently observed in vertebrate genes and that this 'consensus' sequence enhanced translation initiation. However, later studies using invertebrate, fungal and plant genes reported different 'consensus' sequences. In this study, we conducted extensive comparative analyses of nucleotide sequences around the initiation codon by using genomic data from 47 eukaryote species including animals, fungi, plants and protists. The analyses revealed that preferred nucleotide sequences are quite diverse among different species, but differences between patterns of nucleotide bias roughly reflect the evolutionary relationships of the species. We also found strong biases of A/G at position -3, A/C at position -2 and C at position +5 that were commonly observed in all species examined. Genes with higher expression levels showed stronger signals, suggesting that these nucleotides are responsible for the regulation of translation initiation. The diversity of preferred nucleotide sequences around the initiation codon might be explained by differences in relative contributions from two distinct patterns, GCCGCCAUG and AAAAAAAUG, which implies the presence of multiple molecular mechanisms for controlling translation initiation.
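A position-specific bias such as "A/G at position -3" can be measured by tabulating nucleotide frequencies at a fixed offset from the initiation codon. The sketch below is only an illustration under stated assumptions: the sequences are already aligned so that the A of AUG sits at the same index, the A of AUG is numbered +1 with no position 0, and the function name `position_frequencies` and the four toy sequences are hypothetical rather than the authors' data or pipeline.

```python
from collections import Counter
from typing import Dict, List


def position_frequencies(seqs: List[str], start_index: int, position: int) -> Dict[str, float]:
    """Nucleotide frequencies at `position` relative to the start codon.

    Numbering assumption: the A of AUG is +1 and the base immediately
    upstream is -1 (there is no position 0). `start_index` is the
    0-based index of that A in every sequence.
    """
    if position == 0:
        raise ValueError("position 0 does not exist in this numbering")
    # Convert the biological position to a 0-based offset from the A of AUG.
    offset = position if position < 0 else position - 1
    counts = Counter(s[start_index + offset] for s in seqs)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


# Toy sequences (hypothetical), each with AUG starting at index 6.
seqs = ["GCCACCAUGGCG", "AAAAAAAUGUCU", "GCCGCCAUGGCU", "UUUAUUAUGGAA"]
freq_minus3 = position_frequencies(seqs, start_index=6, position=-3)
freq_plus1 = position_frequencies(seqs, start_index=6, position=1)
```

In this toy set, position -3 is A in three sequences and G in one, so the purine bias at -3 shows up directly in `freq_minus3`.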

11.
The complete genome of the yeast Saccharomyces cerevisiae was investigated for intrachromosomal duplications at the level of nucleotide sequences. The analysis was performed by looking for long approximate repeats (from 30 to 3,885 bp) present on each of the chromosomes. We show that direct and inverted repeats exhibit very different characteristics: the two copies of direct repeats are more similar and longer than those of inverted repeats. Furthermore, contrary to the inverted repeats, a large majority of direct repeats appear to be closely spaced. The distance (delta) between the two copies is generally smaller than 1 kb. Further analysis of these "close direct repeats" shows a negative correlation between delta and the percentage of identity between the two copies, and a positive correlation between delta and repeat length. Moreover, contrary to the other categories of repeats, close direct repeats are mostly located within coding sequences (CDSs). We propose two hypotheses to interpret these observations: first, the deletion/conversion rate is negatively correlated with delta; second, there exists an active duplication mechanism which continuously creates close direct repeats, the other intrachromosomal repeats being the result of chromosomal rearrangements of these "primary repeats."

12.
Dehnert M, Helm WE, Hütt MT. Gene 2005, 345(1): 81-90
We study short-range correlations in DNA sequences with methods from information theory and statistics. We find a persisting degree of identity between the correlation patterns of different chromosomes of a species. Except for the case of human and chimpanzee, inter-species differences in this correlation pattern allow robust species distinction: in a clustering tree based upon the correlation curves at the level of individual chromosomes, distinct clusters for the individual species are found. This capacity to distinguish species persists even when the length of the underlying sequences is drastically reduced. In comparison to the standard tool for studying symbol correlations in DNA sequences, namely the mutual information function, we find that an autoregressive model for higher order Markov processes significantly improves species distinction due to an implicit subtraction of random background.
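The mutual information function named in this abstract as the standard tool can be sketched as follows. This is a plain plug-in estimate, I(k) = Σ p_ab(k) log2[p_ab(k) / (p_a p_b)], computed over all symbol pairs at distance k; it is not the autoregressive model the authors propose, and the function name `mutual_information` and the toy sequences are assumptions for illustration.

```python
import math
from collections import Counter


def mutual_information(seq: str, k: int) -> float:
    """Plug-in estimate (in bits) of the mutual information between
    symbols separated by k positions in `seq`."""
    if not 1 <= k < len(seq):
        raise ValueError("k must satisfy 1 <= k < len(seq)")
    # Joint counts of (symbol at i, symbol at i + k).
    pairs = Counter(zip(seq, seq[k:]))
    n = sum(pairs.values())
    # Marginal counts taken from the same pairs so the estimate is non-negative.
    left, right = Counter(), Counter()
    for (a, b), c in pairs.items():
        left[a] += c
        right[b] += c
    # p_ab / (p_a * p_b) simplifies to c * n / (left[a] * right[b]).
    return sum(
        (c / n) * math.log2(c * n / (left[a] * right[b]))
        for (a, b), c in pairs.items()
    )


periodic = "ACGT" * 50             # perfectly periodic toy sequence
mi_period = mutual_information(periodic, 4)
mi_constant = mutual_information("A" * 20, 1)
```

Evaluating the function over a range of k gives a correlation curve per chromosome; curves of this kind are what a clustering step would then compare.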

13.
Recent genome size estimates for Arctic amphipods have revealed the largest genomes known in the Crustacea. Here we provide additional data for 7 species of caridean shrimp collected from the Canadian Arctic and the Gulf of St. Lawrence. Genome sizes were estimated by flow cytometry and haploid C-values ranged from 8.53 +/- 0.30 pg in Pandalus montagui (Pandalidae) to 40.89 +/- 1.23 pg in Sclerocrangon ferox (Crangonidae). The value for S. ferox represents the largest decapod genome yet recorded and indicates a 38-fold variation in genome size within this order. These data suggest that large genomes may be relatively common in Arctic crustaceans, and underline the need for further comparative studies.

14.
Segmental duplications (SDs) are a major element of eukaryotic genomes. Whereas their quantitative importance varies among lineages, SDs appear as a fundamental trait of the recent evolution of great ape genomes. The chromosomal instability generated by these SDs has dramatic consequences both in generating a high level of polymorphism among individuals and in giving rise to numerous human diseases. However, even though the importance of SDs has been increasingly recognized at the genomic level, some of the molecular pathways that lead to their formation remain obscure. Here we review recent evidence that the interplay between several mechanisms, some conservative, some based on replication, explains the complex SD patterns observed in many genomes. Recent experimental studies have indeed partially unveiled some important aspects of these mechanisms, shedding interesting and unsuspected new light on the dramatic plasticity of eukaryotic genomes. To cite this article: R. Koszul, G. Fischer, C. R. Biologies 332 (2009).

15.
In this paper, we are interested in the computational complexity of computing (dis)similarity measures between two genomes when they contain duplicated genes or genomic markers, a problem that arises frequently when comparing whole nuclear genomes. Recently, several methods ([1], [2]) have been proposed that are based on two steps to compute a given (dis)similarity measure M between two genomes G_1 and G_2: first, one establishes a one-to-one correspondence between genes of G_1 and genes of G_2; second, once this correspondence is established, it explicitly defines a permutation, and it is then possible to quantify their similarity using classical measures defined for permutations, like the number of breakpoints. Hence these methods rely on two elements: a way to establish a one-to-one correspondence between genes of a pair of genomes, and a (dis)similarity measure for permutations. The problem is then, given a (dis)similarity measure for permutations, to compute a correspondence that defines an optimal permutation for this measure. We are interested here in two models to compute a one-to-one correspondence: the exemplar model, where all but one copy are deleted in both genomes for each gene family, and the matching model, which computes a maximal correspondence for each gene family. We show that for these two models, and for three (dis)similarity measures on permutations, namely the number of common intervals, the maximum adjacency disruption (MAD) number and the summed adjacency disruption (SAD) number, the problem of computing an optimal correspondence is NP-complete, and even APX-hard for the MAD number and SAD number.
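Once a one-to-one correspondence is fixed, a classical permutation measure such as the number of breakpoints is easy to compute; the hardness discussed above lies in choosing the correspondence. The sketch below assumes unsigned gene orders without framing elements, so a breakpoint is an adjacent pair in the first order that is not adjacent (in either direction) in the second. Other definitions add signs or end markers, and the function name `breakpoints` is hypothetical.

```python
from typing import Hashable, Sequence


def breakpoints(order_a: Sequence[Hashable], order_b: Sequence[Hashable]) -> int:
    """Count adjacencies of `order_a` that are not adjacencies of `order_b`.

    Assumes both orders contain the same genes exactly once each
    (duplicates already resolved, e.g. by an exemplar or matching step).
    """
    if len(set(order_a)) != len(order_a) or set(order_a) != set(order_b) \
            or len(order_a) != len(order_b):
        raise ValueError("orders must be permutations of the same gene set")
    # Unsigned adjacencies: {x, y} is the same adjacency as {y, x}.
    adjacent_in_b = {frozenset(pair) for pair in zip(order_b, order_b[1:])}
    return sum(
        frozenset(pair) not in adjacent_in_b
        for pair in zip(order_a, order_a[1:])
    )


identity = [1, 2, 3, 4, 5]
swapped = [1, 3, 2, 4, 5]
bp_same = breakpoints(identity, identity)
bp_swapped = breakpoints(identity, swapped)
```

With duplicated genes, each choice of which copies to keep gives a different pair of permutations, and the optimization is over all such choices.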

16.
Adenine nucleotides have been found to appear preferentially in the regions after the initiation codons or before the termination codons of bacterial genes. Our previous experiments showed that AAA and AAT, the two most frequent second codons in Escherichia coli, significantly enhance translation efficiency. To determine whether such a characteristic feature of base frequencies exists in eukaryote genes, we performed a comparative analysis of the base biases at the gene terminal portions using the proteomes of seven eukaryotes. Here we show that the base appearance at the codon third positions of gene terminal regions is highly biased in eukaryote genomes, although the codon third positions are almost free from amino acid preference. The bias changes depending on its position in a gene, and is characteristic of each species. We also found that the bias is most pronounced at the second codon, the codon after the initiation codon. NCN is preferred in every genome; in particular, GCG is strongly favored in human and plant genes. The presence of the bias implies that the base sequences at the second codon affect translation efficiency in eukaryotes as well as bacteria.

17.
The DNAs of different members of the Papillomavirus genus of papovaviruses were analyzed for nucleotide sequence homology. Under standard hybridization conditions (Tm - 28 degrees C), no homology was detectable among the genomes of human papillomavirus type 1 (HPV-1), bovine papillomavirus type 2 (BPV-2), or cottontail rabbit (Shope) papillomavirus (CRPV). However, under less stringent conditions (i.e., Tm - 43 degrees C), stable hybrids were formed between radiolabeled DNAs of CRPV, BPV-1, or BPV-2 and the HindIII-HpaI A, B, and C fragments of HPV-1. Under these same conditions, radiolabeled CRPV and HPV-1 DNAs formed stable hybrids with HincII B and C fragments of BPV-2 DNA. These results indicate that there are regions of homology with as much as 70% base match among all these papillomavirus genomes. Furthermore, unlabeled HPV-1 DNA competitively inhibited the specific hybridization of radiolabeled CRPV DNA to BPV-2 DNA fragments, indicating that the homologous DNA segments are common among these remotely related papillomavirus genomes. These conserved sequences are specific for the Papillomavirus genus of papovaviruses, as evidenced by the lack of hybridization between HPV-1 DNA and either simian virus 40 or human papovavirus BK DNA under identical conditions. These results indicate a close evolutionary relationship among the papillomaviruses and further establish the papillomaviruses and polyoma viruses as distinct genera.

18.

Background  

Segmental duplications, or low-copy repeats, are common in mammalian genomes. In the human genome, most segmental duplications are mosaics comprised of multiple duplicated fragments. This complex genomic organization complicates analysis of the evolutionary history of these sequences. One model proposed to explain this mosaic pattern is a model of repeated aggregation and subsequent duplication of genomic sequences.

19.
The genetic relationships of A genomes of Triticum urartu (Au) and Triticum monococcum (Am) in polyploid wheats are explored and quantified by AFLP fingerprinting. Forty-one accessions of A-genome diploid wheats, 3 of AG-genome wheats, 19 of AB-genome wheats, 15 of ABD-genome wheats, and 1 of the D-genome donor Ae. tauschii have been analysed. Based on 7 AFLP primer combinations, 423 bands were identified as potentially A genome specific. The bands were reduced to 239 by eliminating those present in autoradiograms of Ae. tauschii, bands interpreted as common to all wheat genomes. Neighbour-joining analysis separates T. urartu from T. monococcum. Triticum urartu has the closest relationship to polyploid wheats. Triticum turgidum subsp. dicoccum and T. turgidum subsp. durum lines are included in tightly linked clusters. The hexaploid spelts occupy positions in the phylogenetic tree intermediate between bread wheats and T. turgidum. The AG-genome accessions cluster in a position quite distant from both diploid and other polyploid wheats. The estimates of similarity between A genomes of diploid and polyploid wheats indicate that, compared with Am, Au has around 20% higher similarity to the genomes of polyploid wheats. The Triticum timopheevii AG genome is molecularly equidistant from those of Au and Am wheats.

20.
Wide distribution of short interspersed elements among eukaryotic genomes.   Cited by: 7 (self-citations: 0, citations by others: 7)
Most short interspersed elements (SINEs) in eukaryotic genomes originate from tRNA and have internal promoters for RNA polymerase III. The promoter contains two boxes (A and B) spaced by approximately 33 bp. We used oligonucleotide primers specific to these boxes to detect SINEs in genomic DNA by polymerase chain reaction (PCR). Appropriate DNA fragments were revealed by PCR in 30 out of 35 eukaryotic species, suggesting a wide distribution of SINEs. The PCR products were used for hybridization screening of genomic libraries, which resulted in the identification of four novel SINE families. The application of this approach is illustrated by the discovery of a SINE family in the genome of the bat Myotis daubentoni. Members of this SINE family, termed VES, have an additional B-like box, a putative polyadenylation signal and an RNA polymerase III terminator.
