Similar literature
 Found 20 similar articles; search took 15 ms
1.

Background  

Gene duplication and gene loss during the evolution of eukaryotes have hindered attempts to estimate phylogenies and divergence times of species. Although current methods that identify clusters of orthologous genes in complete genomes have helped to investigate gene function and gene content, they have not been optimized for evolutionary sequence analyses requiring strict orthology and complete gene matrices. Here we adopt a relatively simple and fast genome comparison approach designed to assemble orthologs for evolutionary analysis. Our approach identifies single-copy genes representing only species divergences (panorthologs) in order to minimize potential errors caused by gene duplication. We apply this approach to complete sets of proteins from published eukaryote genomes specifically for phylogeny and time estimation.
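The panortholog idea described above (keep only gene clusters with exactly one copy in every species) can be illustrated with a minimal sketch. It assumes that ortholog clusters have already been built by some upstream similarity search; the function name `panortholog_clusters`, the input layout, and all gene identifiers are hypothetical and are not taken from the paper.

```python
from collections import Counter


def panortholog_clusters(clusters, species):
    """Keep clusters that contain exactly one gene from every species.

    clusters: dict mapping cluster id -> list of (species, gene_id) pairs.
    species:  iterable of all species names under comparison.
    """
    required = set(species)
    kept = {}
    for cluster_id, members in clusters.items():
        counts = Counter(sp for sp, _ in members)
        # Single copy in each species, with no species missing or extra.
        if set(counts) == required and all(n == 1 for n in counts.values()):
            kept[cluster_id] = {sp: gene for sp, gene in members}
    return kept


# Toy input; every identifier here is invented for illustration.
clusters = {
    "c1": [("human", "hA"), ("yeast", "yA"), ("fly", "fA")],
    # Duplication in human: excluded.
    "c2": [("human", "hB1"), ("human", "hB2"), ("yeast", "yB"), ("fly", "fB")],
    # Gene loss in yeast: excluded, so the gene matrix stays complete.
    "c3": [("human", "hC"), ("fly", "fC")],
}
result = panortholog_clusters(clusters, ["human", "yeast", "fly"])
print(sorted(result))
```

Filtering this strictly discards many genes, which is the trade-off the abstract accepts in exchange for strict orthology and a complete gene matrix.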

2.
Bornaviruses are the only animal RNA viruses that establish a persistent infection in their host cell nucleus. Studies of bornaviruses have provided unique information about viral replication strategies and virus–host interactions. Although bornaviruses do not integrate into the host genome during their replication cycle, we and others have recently reported that there are DNA sequences derived from the mRNAs of ancient bornaviruses in the genomes of vertebrates, including humans, and these have been designated endogenous borna-like (EBL) elements. Therefore, bornaviruses have been interacting with their hosts as driving forces in the evolution of host genomes in a previously unexpected way. Studies of EBL elements have provided new models for virology, evolutionary biology and general cell biology. In this review, we summarize the data on EBL elements, including those we have newly identified in eukaryotic genomes, and discuss the biological significance of EBL elements, with a focus on EBL nucleoprotein elements in mammalian genomes. Surprisingly, EBL elements were detected in the genomes of invertebrates, suggesting that the host range of bornaviruses may be much wider than previously thought. We also review our new data on non-retroviral integration of Borna disease virus.

3.
Complete eukaryote chromosomes were investigated for intrachromosomal duplications of nucleotide sequences. The analysis was performed by looking for nonexact repeats on two complete genomes, Saccharomyces cerevisiae and Caenorhabditis elegans, and four partial ones, Drosophila melanogaster, Plasmodium falciparum, Arabidopsis thaliana, and Homo sapiens. Through this analysis, we show that all eukaryote chromosomes exhibit similar characteristics for their intrachromosomal repeats, suggesting similar dynamics: many direct repeats have their two copies physically close together, and these close direct repeats are more similar and shorter than the other repeats. In contrast, there are almost no close inverted repeats. These results support a model for the dynamics of duplication. This model is based on a continuous genesis of tandem repeats and implies that most of the distant and inverted repeats originate from these tandem repeats by further chromosomal rearrangements (insertions, inversions, and deletions). Remnants of these predicted rearrangements were revealed through fine analysis of the chromosome sequence. Despite these dynamics, shared by all eukaryotes, each genome exhibits its own style of intrachromosomal duplication: the density of repeated elements is similar in all chromosomes from the same genome, but differs between species. This density was further related to the relative rates of duplication, deletion, and mutation specific to each species. Notably, the density of repeats in the X chromosome of C. elegans is much lower than in the autosomes of that organism, suggesting that the exchange between homologous chromosomes is important in the duplication process.

4.
An extreme level of DNA sequence polymorphism, the basis of DNA fingerprinting, was first demonstrated using genome-derived cloned probes. Subsequently, it was shown that DNA fingerprinting can also be carried out using short synthetic oligodeoxyribonucleotide probes specific for simple repetitive sequences. Further, in addition to radioactively labeled probes, non-radioactive oligonucleotides generate equally informative hybridization patterns. We discuss developments in the area of DNA fingerprinting and its future scope with respect to plant, animal and human DNA.

5.
6.
7.
Understanding regulatory mechanisms of protein synthesis in eukaryotes is essential for the accurate annotation of genome sequences. Kozak reported that the nucleotide sequence GCCGCC(A/G)CCAUGG (AUG is the initiation codon) was frequently observed in vertebrate genes and that this 'consensus' sequence enhanced translation initiation. However, later studies using invertebrate, fungal and plant genes reported different 'consensus' sequences. In this study, we conducted extensive comparative analyses of nucleotide sequences around the initiation codon by using genomic data from 47 eukaryote species including animals, fungi, plants and protists. The analyses revealed that preferred nucleotide sequences are quite diverse among different species, but differences between patterns of nucleotide bias roughly reflect the evolutionary relationships of the species. We also found strong biases of A/G at position -3, A/C at position -2 and C at position +5 that were commonly observed in all species examined. Genes with higher expression levels showed stronger signals, suggesting that these nucleotides are responsible for the regulation of translation initiation. The diversity of preferred nucleotide sequences around the initiation codon might be explained by differences in relative contributions from two distinct patterns, GCCGCCAUG and AAAAAAAUG, which implies the presence of multiple molecular mechanisms for controlling translation initiation.
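The position-specific biases mentioned above (for example A/G at position -3) are the kind of statistic that a simple per-position frequency table exposes. The sketch below is only an illustration and is not the authors' pipeline. It assumes equal-length RNA contexts that are already aligned on the initiation codon, and it uses the usual numbering in which the A of AUG is +1 and the preceding base is -1, with no position 0. The function name `position_frequencies` and the four toy sequences are hypothetical.

```python
from collections import Counter


def position_frequencies(contexts, start_index):
    """Per-position base frequencies around the initiation codon.

    contexts:    equal-length RNA strings, each with the A of AUG
                 at index start_index.
    start_index: zero-based index of that A within every string.
    Returns a dict mapping position (..., -2, -1, +1, +2, ...) to a
    dict of base -> relative frequency.
    """
    length = len(contexts[0])
    if any(len(seq) != length for seq in contexts):
        raise ValueError("all contexts must have the same length")
    freqs = {}
    for i in range(length):
        offset = i - start_index
        # Conventional numbering skips 0: A of AUG is +1, base before it is -1.
        position = offset + 1 if offset >= 0 else offset
        counts = Counter(seq[i] for seq in contexts)
        freqs[position] = {b: counts[b] / len(contexts) for b in "ACGU"}
    return freqs


# Toy contexts (invented): 6 bases upstream, AUG, 2 bases downstream.
contexts = ["GCCACCAUGGC", "AAAAAAAUGGC", "GCCGCCAUGGC", "UUUUCUAUGAA"]
freqs = position_frequencies(contexts, start_index=6)
print(freqs[-3])  # frequencies at position -3
print(freqs[5])   # frequencies at position +5
```

On real data one would compare such tables between species, or between highly and weakly expressed genes, as the abstract describes.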

8.
Large-scale expression data are now measured for several thousands of genes simultaneously. Furthermore, most genes are being categorized according to their properties. This development has been followed by an exploration of theoretical tools to integrate these diverse data types. A key problem is the high noise level in the data. Here, we investigate ways to extract the remaining signals within these noisy data sets. We find large-scale correlations within data from Saccharomyces cerevisiae with respect to properties of the encoded proteins. These correlations are visualized in a way that is robust to the underlying noise in the measurement of the individual gene expressions. In particular, for S. cerevisiae we observe that the proteins corresponding to the 400 most highly expressed genes are typically localized to the cytoplasm. These most highly expressed genes are not essential for cell survival.

9.
Study of statistical correlations in DNA sequences   Total citations: 3 (self-citations: 0; citations by others: 3)
Here we present a study of statistical correlations among different positions in DNA sequences and their implications by directly using the autocorrelation function. Such an analysis is possible now because of the availability of large sequences or even complete genomes of many organisms. After describing the way in which the autocorrelation function can be applied to DNA-sequence analysis, we show that long-range correlations, implying scale independence, appear in several bacterial genomes as well as in long human chromosome contigs. The source for such correlations in bacteria, which may extend up to 60 kb in Bacillus subtilis, may be related to massive lateral transfer of compositionally biased genes from other genomes. In the human genome, correlations extend for more than five decades and may be related to the evolution of the ‘neogenome’, a modern evolutionary acquisition composed of GC-rich isochores displaying long-range correlations and scale invariance.
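As a rough illustration of applying an autocorrelation function to a DNA sequence, the sketch below maps each base to a GC indicator (1 for G or C, 0 otherwise) and estimates C(k) = <x_i x_(i+k)> - <x_i><x_(i+k)> for lags k = 1..max_lag. The GC indicator mapping is an assumption made here for simplicity; the abstract does not say which numerical encoding the authors used, and the function name `gc_autocorrelation` is hypothetical.

```python
def gc_autocorrelation(seq, max_lag):
    """Autocorrelation of the GC indicator of seq for lags 1..max_lag.

    Returns a list whose element k-1 is the covariance between the
    indicator at position i and at position i+k, averaged over i.
    """
    x = [1.0 if base in "GC" else 0.0 for base in seq.upper()]
    n = len(x)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    result = []
    for k in range(1, max_lag + 1):
        pairs = n - k                      # number of (i, i+k) pairs
        mean_head = sum(x[:pairs]) / pairs
        mean_tail = sum(x[k:]) / pairs
        mean_prod = sum(a * b for a, b in zip(x, x[k:])) / pairs
        result.append(mean_prod - mean_head * mean_tail)
    return result


# Toy sequence with period 2: negative correlation at lag 1, positive at lag 2.
c = gc_autocorrelation("GA" * 50, max_lag=4)
print(c)
```

A long-range correlation of the kind the abstract reports would show up as C(k) decaying slowly, roughly as a power law, over many orders of magnitude in k. This direct loop is quadratic in the worst case, so genome-scale sequences would need an FFT-based estimate instead.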

10.
11.
Adenine nucleotides have been found to appear preferentially in the regions after the initiation codons or before the termination codons of bacterial genes. Our previous experiments showed that AAA and AAT, the two most frequent second codons in Escherichia coli, significantly enhance translation efficiency. To determine whether such a characteristic feature of base frequencies exists in eukaryote genes, we performed a comparative analysis of the base biases at the gene terminal portions using the proteomes of seven eukaryotes. Here we show that the base appearance at the codon third positions of gene terminal regions is highly biased in eukaryote genomes, although the codon third positions are almost free from amino acid preference. The bias changes depending on its position in a gene, and is characteristic of each species. We also found that the bias is most pronounced at the second codon, the codon after the initiation codon. NCN is preferred in every genome; in particular, GCG is strongly favored in human and plant genes. The presence of the bias implies that the base sequences at the second codon affect translation efficiency in eukaryotes as well as bacteria.
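The second-codon bias described above can be tallied directly from coding sequences. This is a minimal sketch, assuming DNA coding sequences that begin with ATG; the function name `second_codon_counts` and the toy sequences are hypothetical and do not come from the study.

```python
from collections import Counter


def second_codon_counts(cds_list):
    """Count the codon immediately after the initiation codon.

    Sequences that do not start with ATG, or that are too short to
    contain a second codon, are skipped.
    """
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper()
        if len(cds) >= 6 and cds.startswith("ATG"):
            counts[cds[3:6]] += 1
    return counts


# Toy coding sequences (invented). The last one lacks an ATG start.
cds_list = ["ATGGCGTAA", "ATGGCGAAATGA", "ATGAAATAA", "CTGGCGTAA"]
counts = second_codon_counts(cds_list)

# Share of second codons matching the NCN pattern (C in the middle position).
ncn = sum(n for codon, n in counts.items() if codon[1] == "C")
print(counts.most_common(), ncn / sum(counts.values()))
```

Comparing these counts with codon usage in the rest of each gene would show whether a codon such as GCG is specifically enriched at the second position.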

12.
Assessing how natural environmental drivers affect biodiversity underpins our understanding of the relationships between complex biotic and ecological factors in natural ecosystems. Of all ecosystems, anthropogenically important estuaries represent a ‘melting pot’ of environmental stressors, typified by extreme salinity variations and associated biological complexity. Although existing models attempt to predict macroorganismal diversity over estuarine salinity gradients, attempts to model microbial biodiversity are limited for eukaryotes. Although diatoms commonly feature as bioindicator species, additional microbial eukaryotes represent a huge resource for assessing ecosystem health. Of these, meiofaunal communities may represent the optimal compromise between functional diversity that can be assessed using morphology and phenotype–environment interactions as compared with smaller life fractions. Here, using 454 Roche sequencing of the 18S nSSU barcode we investigate which of the local natural drivers are most strongly associated with microbial metazoan and sampled protist diversity across the full salinity gradient of the estuarine ecosystem. In order to investigate potential variation at the ecosystem scale, we compare two geographically proximate estuaries (Thames and Mersey, UK) with contrasting histories of anthropogenic stress. The data show that although community turnover is likely to be predictable, taxa are likely to respond to different environmental drivers and, in particular, hydrodynamics, salinity range and granulometry, according to varied life-history characteristics. At the ecosystem level, communities exhibited patterns of estuary-specific similarity within different salinity range habitats, highlighting the environmental sequencing biomonitoring potential of meiofauna, dispersal effects or both.

13.
A glycosylphosphatidylinositol (GPI) anchor is a common but complex C-terminal post-translational modification of extracellular proteins in eukaryotes. Here we investigate the problem of correctly annotating GPI-anchored proteins for the growing number of sequences in public databases. We developed a computational system, called FragAnchor, based on the tandem use of a neural network (NN) and a hidden Markov model (HMM). First, the NN selects potential GPI-anchored proteins in a dataset; then the HMM parses these potential GPI signals and refines the prediction by qualitative scoring. FragAnchor correctly predicted 91% of all the GPI-anchored proteins annotated in the Swiss-Prot database. In a large-scale analysis of 29 eukaryote proteomes, FragAnchor predicted that the percentage of highly probable GPI-anchored proteins is between 0.21% and 2.01%. The distinctive feature of FragAnchor, compared with other systems, is that it targets only the C-terminus of a protein, making it less sensitive to the background noise found in databases and to possibly incomplete protein sequences. Moreover, FragAnchor can be used to predict GPI-anchored proteins in all eukaryotes. Finally, by using qualitative scoring, the predictions combine both sensitivity and information content. The predictor is publicly available at http://navet.ics.hawaii.edu/~fraganchor/NNHMM/NNHMM.html.

14.
MOTIVATION: The recent discovery of the first small modulatory RNA (smRNA) presents the challenge of finding other molecules of similar length and conservation level. Unlike short interfering RNA (siRNA) and micro-RNA (miRNA), effective computational and experimental screening methods are not currently known for this species of RNA molecule, and the discovery of the one known example was partly fortuitous because it happened to be complementary to a well-studied DNA binding motif (the Neuron Restrictive Silencer Element). RESULTS: The existing comparative genomics approaches (e.g., phylogenetic footprinting) rely on alignments of orthologous regions across multiple genomes. This approach, while extremely valuable, is not suitable for finding motifs with highly diverged "non-alignable" flanking regions. Here we show that several unusually long and well conserved motifs can be discovered de novo through a comparative genomics approach that does not require an alignment of orthologous upstream regions. These motifs, including Neuron Restrictive Silencer Element, were missed in recent comparative genomics studies that rely on phylogenetic footprinting. While the functions of these motifs remain unknown, we argue that some may represent biologically important sites. AVAILABILITY: Our comparative genomics software, a web-accessible database of our results and a compilation of experimentally validated binding sites for NRSE can be found at http://www.cse.ucsd.edu/groups/bioinformatics.

15.
16.
Identifying large-scale structural variation in cancer genomes continues to be a challenge to researchers. Current methods rely on genome alignments based on a reference that can be a poor fit to highly variant and complex tumor genomes. To address this challenge we developed a method that uses available breakpoint information to generate models of structural variations. We use these models as references to align previously unmapped and discordant reads from a genome. By using these models to align unmapped reads, we show that our method can help to identify large-scale variations that have been previously missed.

17.
Neustonic organisms inhabit the sea surface microlayer (SML) and have important roles in marine ecosystem functioning. Here, we use high-throughput 18S rRNA gene sequencing to characterize protist and fungal diversity in the SML at a coastal time-series station and compare with underlying plankton assemblages. Protist diversity was higher in February (pre-bloom) compared to April (spring bloom), and was lower in the neuston than in the plankton. Major protist groups, including Stramenopiles and Alveolata, dominated both neuston and plankton assemblages. Chrysophytes and diatoms were enriched in the neuston in April, with diatoms showing distinct changes in community composition between the sampling periods. Pezizomycetes dominated planktonic fungal assemblages, whereas fungal diversity in the neuston was more varied. This is the first study to utilize a molecular-based approach to characterize neustonic protist and fungal assemblages, and provides the most comprehensive diversity assessment to date of this ecosystem. Variability in the SML microeukaryote assemblage structure has potential implications for biogeochemical and food web processes at the air-sea interface.

18.
19.
Han L, Su B, Li WH, Zhao Z. Genome Biology, 2008, 9(5): R79

Background  

CpG islands, which are clusters of CpG dinucleotides in GC-rich regions, are considered gene markers and represent an important feature of mammalian genomes. Previous studies of CpG islands have largely been on specific loci or within one genome. To date, there seems to be no comparative analysis of CpG islands and their density at the DNA sequence level among mammalian genomes and of their correlations with other genome features.
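To make "clusters of CpG dinucleotides in GC-rich regions" concrete, the sketch below computes two statistics commonly used to screen a sequence window: GC content, and the observed/expected CpG ratio, (number of CpG × window length) / (number of C × number of G). The abstract does not state which detection algorithm or thresholds the study used, so this is an assumption-laden illustration; the function name `cpg_stats` is hypothetical.

```python
def cpg_stats(window):
    """Return (GC content, observed/expected CpG ratio) for one window.

    Expected CpG count assumes C and G occur independently:
    expected = (#C * #G) / length, so obs/exp = #CpG * length / (#C * #G).
    """
    seq = window.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c + g) / n if n else 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


# Toy windows (invented): one CpG-dense, one with no C or G at all.
print(cpg_stats("CGCGCGCG"))
print(cpg_stats("ATATAT"))
```

A full detector would slide such a window along the genome and merge adjacent windows that pass chosen cutoffs for length, GC content and obs/exp ratio. Those cutoffs differ between published algorithms and are not given in this abstract.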

20.