Similar literature
 20 similar articles found (search time: 31 ms)
1.

Background  

To meet the needs of gene annotation for newly sequenced organisms, optimized spaced seeds can be implemented into cross-species sequence alignment programs to accurately align gene sequences to the genome of a related species. So far, seed performance has been tested for comparisons between closely related species, such as human and mouse, or on simulated data. As the number and variety of genomes increase, it becomes desirable to identify a small set of universal seeds that perform optimally or near-optimally on a large range of comparisons.

2.

Background  

To date, most fungal phylogenies have been derived from single gene comparisons, or from concatenated alignments of a small number of genes. The increase in fungal genome sequencing presents an opportunity to reconstruct evolutionary events using entire genomes. As a tool for future comparative, phylogenomic and phylogenetic studies, we used both supertrees and concatenated alignments to infer relationships between 42 species of fungi for which complete genome sequences are available.

3.

Background  

It has been suggested previously that genome and proteome sequences show characteristics typical of natural-language texts, such as "signature-style" word usage indicative of authors or topics, and that the algorithms originally developed for natural language processing may therefore be applied to genome sequences to draw biologically relevant conclusions. Following this approach of 'biological language modeling', statistical n-gram analysis has been applied for comparative analysis of whole proteome sequences of 44 organisms. It has been shown that a few particular amino acid n-grams are found in abundance in one organism but occur very rarely in other organisms, thereby serving as genome signatures. At that time proteomes of only 44 organisms were available, thereby limiting the generalization of this hypothesis. Today nearly 1,000 genome sequences and corresponding translated sequences are available, making it feasible to test the existence of biological language models over the evolutionary tree.

4.

Background  

Single nucleotide polymorphisms (SNPs) are important tools in studying complex genetic traits and genome evolution. Computational strategies for SNP discovery make use of the large number of sequences present in public databases (in most cases as expressed sequence tags (ESTs)) and are considered to be faster and more cost-effective than experimental procedures. A major challenge in computational SNP discovery is distinguishing allelic variation from sequence variation between paralogous sequences, in addition to recognizing sequencing errors. For the majority of the public EST sequences, trace or quality files are lacking, which makes detection of reliable SNPs even more difficult because it has to rely on sequence comparisons only.

5.

Background  

Representing symbolic sequences graphically using iterated maps has enjoyed an enduring popularity since it was first proposed by Jeffrey (1990) as chaos game representation (CGR). The usefulness of this representation goes beyond the convenience of a scale-independent representation: it provides a variable memory length representation of transition. This includes the representation of succession with non-integer order, which comes with the promise of generalizing Markovian formalisms. The original proposal targeted genomic sequences only, but since then several generalizations have been proposed, many specifically designed to handle protein data.

6.

Background  

Genome comparisons have made possible the reconstruction of the eutherian ancestral karyotype but also have the potential to provide new insights into the evolutionary inter-relationship of the different eutherian orders within the mammalian phylogenetic tree. Such comparisons can additionally reveal (i) the nature of the DNA sequences present within the evolutionary breakpoint regions and (ii) whether or not the evolutionary breakpoints occur randomly across the genome. Gene synteny analysis (E-painting) not only greatly reduces the complexity of comparative genome sequence analysis but also extends its evolutionary reach.

7.

Background  

Due to recent advances in whole genome shotgun sequencing and assembly technologies, the financial cost of decoding an organism's DNA has been drastically reduced, resulting in a recent explosion of genomic sequencing projects. This increase in related genomic data will allow for in-depth studies of evolution in closely related species through multiple whole genome comparisons.

8.
9.
10.

Background  

Perception of sugars is an invaluable ability for insects, which often derive quickly accessible energy from these molecules. A distinctive subfamily of eight proteins within the gustatory receptor (Gr) family has been identified as sugar receptors (SRs) in Drosophila melanogaster (Gr5a, Gr61a, and Gr64a-f). We examined the evolution of these SRs within the 12 available Drosophila genome sequences, as well as three mosquito, two moth, and beetle, bee, and wasp genome sequences.

11.

Background  

With the growing availability of entire genome sequences, an increasing number of scientists can exploit oligonucleotide microarrays for genome-scale expression studies. While probe design is a major research area, relatively little work has been reported on the optimization of microarray protocols.

12.

Background  

The timescale of prokaryote evolution has been difficult to reconstruct because of a limited fossil record and complexities associated with molecular clocks and deep divergences. However, the relatively large number of genome sequences currently available has provided a better opportunity to control for potential biases such as horizontal gene transfer and rate differences among lineages. We assembled a data set of sequences from 32 proteins (~7600 amino acids) common to 72 species and estimated phylogenetic relationships and divergence times with a local clock method.

13.
14.

Background  

Transposable elements (TEs) are abundant genomic sequences that have been found to contribute to genome evolution in unexpected ways. Here, we characterize the evolutionary and functional characteristics of TE-derived human genome regulatory sequences uncovered by the high throughput mapping of DNaseI-hypersensitive (HS) sites.

15.

Background  

Analysis of any newly sequenced bacterial genome starts with the identification of protein-coding genes. Despite the accumulation of multiple complete genome sequences, which provide useful comparisons with close relatives among other organisms during the annotation process, accurate gene prediction remains quite difficult. A major reason for this situation is that genes are tightly packed in prokaryotes, resulting in frequent overlap. Thus, detection of translation initiation sites and/or selection of the correct coding regions remain difficult unless appropriate biological knowledge (about the structure of a gene) is embedded in the approach.

16.

Background  

Gene loss, inversions, translocations, and other chromosomal rearrangements vary among species, resulting in different rates of structural genome evolution. Major chromosomal rearrangements are rare in most eukaryotes, giving large regions with the same genes in the same order and orientation across species. These regions of macrosynteny have been very useful for locating homologous genes in different species and for guiding the assembly of genome sequences. Previous analyses in the fungi have indicated that macrosynteny is rare; instead, comparisons across species show no synteny or only microsyntenic regions encompassing usually five or fewer genes. To test the hypothesis that chromosomal evolution is different in the fungi compared to other eukaryotes, synteny was compared between species of the major fungal taxa.

17.

Background  

It has been shown for an evolutionarily distant genomic comparison that the number of protein-protein interactions a protein has correlates negatively with its rate of evolution. However, the generality of this observation has recently been challenged. Here we examine the problem using protein-protein interaction data from the yeast Saccharomyces cerevisiae and genome sequences from two other yeast species.

18.

Background  

Ribosomal proteins (RPs) are key components of ribosomes, the cellular organelles responsible for protein biosynthesis. Their levels can vary as a function of organism growth and development; however, some RPs have been associated with other cellular processes or extraribosomal functions. Their high representation in cDNA libraries has resulted in the increase of RP sequences available from different organisms and their proposal as appropriate molecular markers for phylogenetic analysis.

19.

Background  

The phylogenetic distribution of large-scale genome structure (i.e. mosaic compositional patchiness) has been explored mainly by analytical ultracentrifugation of bulk DNA. However, with the availability of large, good-quality chromosome sequences, and the recently developed computational methods to directly analyze patchiness on the genome sequence, an evolutionary comparative analysis can be carried out at the sequence level.

20.

Background  

The rapid completion of genome sequences has created an infrastructure of biological information and provided essential information to link genes to gene products, proteins, the building blocks for cellular functions. In addition, genome/cDNA sequences make it possible to predict proteins for which there is no experimental evidence. Clues to the function of hypothetical proteins are provided by sequence similarity with proteins of known function in model organisms.
