Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
The orientation of closely linked genes in mammalian genomes is not random: there are more head-to-head (h2h) gene pairs than expected. To understand the origin of this enrichment in h2h gene pairs, we have analyzed the phylogenetic distribution of gene pairs separated by less than 600 bp of intergenic DNA (gene duos). We show here that a lack of head-to-tail (h2t) gene duos is an even more distinctive characteristic of mammalian genomes, with the platypus genome as the only exception. In nonmammalian vertebrate and in nonvertebrate genomes, the frequency of h2h, h2t, and tail-to-tail (t2t) gene duos is close to random. In tetrapod genomes, the h2t and t2t gene duos are more likely to be part of a larger gene cluster of closely spaced genes than h2h gene duos; in fish and urochordate genomes, the reverse is seen. In human and mouse tissues, the expression profiles of gene duos were skewed toward positive coexpression, irrespective of orientation. The organization of orthologs of both members of about 40% of the human gene duos could be traced in other species, enabling a prediction of the organization at the branch points of gnathostomes, tetrapods, amniotes, and euarchontoglires. The accumulation of h2h gene duos started in tetrapods, whereas that of h2t and t2t gene duos only started in amniotes. The apparent lack of evolutionary conservation of h2t and t2t gene duos relative to that of h2h gene duos is thus a result of their relatively late origin in the lineage leading to mammals; we show that once they are formed h2t and t2t gene duos are as stable as h2h gene duos.
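As an illustration of the gene-duo definition used in this abstract, the following minimal sketch classifies adjacent gene pairs by orientation. It is not the authors' pipeline: the `Gene` record, the `classify_duos` function, the 1-based inclusive coordinates, and the choice to skip overlapping genes are all assumptions made for the example. Only the 600 bp threshold and the h2h/h2t/t2t labels come from the abstract.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates are assumed 1-based and inclusive."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def classify_duos(genes, max_gap=600):
    """Return (left, right, orientation) for adjacent genes on one chromosome
    separated by fewer than max_gap bp of intergenic DNA."""
    ordered = sorted(genes, key=lambda g: g.start)
    duos = []
    for left, right in zip(ordered, ordered[1:]):
        gap = right.start - left.end - 1  # intergenic bases between the two genes
        if gap < 0 or gap >= max_gap:
            continue  # overlapping genes are skipped here (an assumption), as are distant ones
        if left.strand == right.strand:
            orientation = "h2t"  # same strand: one gene's head faces the other's tail
        elif left.strand == "-":
            orientation = "h2h"  # "-" then "+": divergent, the 5' ends face each other
        else:
            orientation = "t2t"  # "+" then "-": convergent, the 3' ends face each other
        duos.append((left.name, right.name, orientation))
    return duos
```

Calling `classify_duos` once per chromosome and tallying the three labels gives the kind of orientation frequencies the abstract compares against a random expectation.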

2.
Two new short retroposon families (SINEs) have been found in the genome of springhare Pedetes capensis (Rodentia). One of them, Ped-1, originated from 5S rRNA, while the other one, Ped-2, originated from tRNA-derived SINE ID. In contrast to most currently active mammalian SINEs mobilized by L1 long retrotransposon (LINE), Ped-1 and Ped-2 are mobilized by Bov-B, a LINE family of the widely distributed RTE clade. The 3' part of these SINEs originates from two sequences in the 5' and 3' regions of Bov-B. Such bipartite structure of the LINE-derived part has been revealed in all Bov-B-mobilized SINEs known to date (AfroSINE, Bov-tA, Mar-1, and Ped-1/2), which distinguishes them from other SINEs with only a 3' LINE-derived part. Structural analysis and the distribution of Bov-B LINEs and partner SINEs support the horizontal transfer of Bov-B, while the SINEs emerged independently in lineages with this LINE.

3.
CpG islands in vertebrate genomes   Total citations: 120 (self-citations: 0; citations by others: 120)

4.
5.
Frenkel S, Kirzhner V, Korol A. PLoS ONE 2012, 7(2): e32076
Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter, GDM), allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
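The k-mer scoring step mentioned in this abstract can be sketched as follows. This is a simplified illustration rather than the published CS procedure: it counts exact k-mer matches only, and the function name `kmer_spectrum`, the default `k=3`, and the handling of ambiguous bases are assumptions made for the example.

```python
from collections import Counter
from itertools import product


def kmer_spectrum(seq, k=3):
    """Return the relative frequency of every DNA k-mer in seq.

    Simplifying assumption: only exact matches are counted, and windows
    containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    # Slide a window of width k over the sequence, one base at a time.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[w] for w in words)
    return {w: (counts[w] / total if total else 0.0) for w in words}
```

Computing such a spectrum for each window along a chromosome, then measuring how far the spectra lie from one another, is one simple way to put a number on regional heterogeneity of the kind the abstract discusses.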

6.
The comparison of genomic sequences is now a common approach to identifying and characterizing functional regions in vertebrate genomes. However, for theoretical reasons and because of practical issues, the generation of these data sets is non-trivial and can have many pitfalls. We are currently seeing an explosion of comparative sequence data, the benefits and limitations of which need to be disseminated to the scientific community. This Review provides a critical overview of the different types of sequence data that are available for analysis and of contemporary comparative sequence analysis methods, highlighting both their strengths and limitations. Approaches to determining the biological significance of constrained sequence are also explored.

7.
8.
Detection of functional modules from protein interaction networks   Total citations: 4 (self-citations: 0; citations by others: 4)

9.
The compositional evolution of vertebrate genomes   Total citations: 7 (self-citations: 0; citations by others: 7)
Bernardi G. Gene 2000, 259(1-2): 31-43
The compositional evolution of vertebrate genomes is characterized: (i) by one predominant conservative mode, in which nucleotide changes occur, but the base composition of DNA sequences in general, and of coding sequences in particular, does not change; and (ii) by three different shifting or transitional modes, in which nucleotide changes are accompanied by changes in the base composition of sequences. Investigations on these evolutionary modes have shed new light on a central problem in molecular evolution, namely the role played by natural selection in modulating the mutational input. This review will present first the intragenomic shifts, the 'major shifts' and the 'minor shift', and then the 'whole-genome', or 'horizontal', shift. In each case, the shifts were preceded and followed by a conservative mode of evolution. This review expands on a previous one [Bernardi, Gene 241 (2000) 3-17], and summarizes the evidence that the changes of the compositional patterns of the genome and their maintenance are controlled by Darwinian natural selection.

10.
We present ParaDB (http://abi.marseille.inserm.fr/paradb/), a new database for large-scale paralogy studies in vertebrate genomes. We intended to collect all information (sequence, mapping and phylogenetic data) needed to map and detect new paralogous regions, previously defined as Paralogons. The AceDB database software was used to generate graphical objects and to organize data. General data were automatically collated from public sources (Ensembl, GadFly and RefSeq). ParaDB provides access to data derived from whole genome sequences (Homo sapiens, Mus musculus and Drosophila melanogaster): cDNA and protein sequences, positional information, bibliographical links. In addition, we provide BLAST results for each protein sequence, InParanoid orthologs and 'In-Paralogs' data, previously established paralogy data, and, to compare vertebrates and Drosophila, orthology data.

11.
In spite of the importance of point mutations for evolution and human diseases, their natural spectrum of incidence in different species is not known. Here I propose to determine these spectra by comparing consecutive sequence periods in stretches of repetitive DNA. The article presents the analysis of more than 51,000 such point mutations identified by this approach in the genomes of human, chimpanzee, rat, mouse, pufferfish, zebrafish, and sea squirt. I propose to explain the observed spectra by auto-mutagenic mechanisms of genome variation involving the inter-conversions of nucleotides, single base-pair inversions and their combinations.

12.
We recently developed a method for producing comprehensive gene and species phylogenies from unaligned whole genome data using singular value decomposition (SVD) to analyze character string frequencies. This work provides an integrated gene and species phylogeny for 64 vertebrate mitochondrial genomes composed of 832 total proteins. In addition, to provide a theoretical basis for the method, we present a graphical interpretation of both the original frequency matrix and the SVD-derived matrix. These large matrices describe high-dimensional Euclidean spaces within which biomolecular sequences can be uniquely represented as vectors. In particular, the SVD-derived vector space describes each protein relative to a restricted set of newly defined, independent axes, each of which represents a novel form of conserved motif, termed a correlated peptide motif. A quantitative comparison of the relative orientations of protein vectors in this space provides accurate and straightforward estimates of sequence similarity, which can in turn be used to produce comprehensive gene trees. Alternatively, the vector representations of genes from individual species can be summed, allowing species trees to be produced.

13.
Evolution of N-terminal sequences of the vertebrate HOXA13 protein   Total citations: 8 (self-citations: 0; citations by others: 8)
While the role of the homeodomain in HOX function has been evaluated extensively, little attention has been given to the non-homeodomain portions of the HOX proteins. To investigate the evolution of the HOXA13 protein and to identify conserved residues in the N-terminal region of the protein with potential functional significance, N-terminal Hoxa13 coding sequences were PCR-amplified from fish, amphibian, reptile, chicken, and marsupial and eutherian mammal genomic DNA. Compared with fish HOXA13, the mammalian protein has increased in size by 35% primarily owing to the accumulation of alanine repeats and flanking segments rich in proline, glycine, or serine within the first 215 amino acids. Certain residues and amino acid motifs were strongly conserved, and several HOXA13 N-terminal domains were also shared in the paralogous HOXB13 and HOXD13 genes; however, other conserved regions appear to be unique to HOXA13. Two domains highly conserved in HOXA13 orthologs are shared with Drosophila AbdB and other vertebrate AbdB-like proteins. Marsupial and eutherian mammalian HOXA13 proteins have three large homopolymeric alanine repeats of 14, 12, and 17–18 residues that are absent in reptiles, birds, and fish. Thus, the repeats arose after the divergence of reptiles from the lineage that would give rise to the mammals. In contrast, other short homopolymeric alanine repeats in mammalian HOXA13 have remained virtually the same length, suggesting that forces driving or limiting repeat expansion are context dependent. Consecutive stretches of identical third-base usage in alanine codons within the large repeats were found, supporting replication slippage as a mechanism for their generation. However, numerous species-specific base substitutions affecting third-base alanine repeat codon positions were observed, particularly in the largest repeat. Therefore, if the large alanine repeats were present prior to eutherian mammal development as is suggested by the opossum data, then a dynamic process of recurring replication slippage and point mutation within alanine repeat codons must be considered to reconcile these observations. This model might also explain why the alanine repeats are flanked by proline, serine, and glycine-rich sequences, and it reveals a biological mechanism that promotes increases in protein size and, potentially, acquisition of new functions. Received: 8 June 1999 / Accepted: 23 September 1999

14.
15.
16.

Background  

Molecular networks represent the backbone of molecular activity within cells and provide opportunities for understanding the mechanism of diseases. While protein-protein interaction data constitute static network maps, integration of condition-specific co-expression information provides clues to the dynamic features of these networks. Dilated cardiomyopathy is a leading cause of heart failure. Although previous studies have identified putative biomarkers or therapeutic targets for heart failure, the underlying molecular mechanism of dilated cardiomyopathy remains unclear.

17.
Recent advances in high throughput experiments and annotations via published literature have provided a wealth of interaction maps of several biomolecular networks, including metabolic, protein-protein, and protein-DNA interaction networks. The architecture of these molecular networks reveals important principles of cellular organization and molecular functions. Analyzing such networks, i.e., discovering dense regions in the network, is an important way to identify protein complexes and functional modules. This task has been formulated as the problem of finding heavy subgraphs, the heaviest k-subgraph problem (k-HSP), which itself is NP-hard. However, any method based on the k-HSP requires the parameter k, and an exact solution of k-HSP may still end up as a "spurious" heavy subgraph, thus reducing its practicability in analyzing large scale biological networks. We propose a new formulation, called the rank-HSP, and two dynamical systems to approximate its results. In addition, a novel metric, called the standard deviation and mean ratio (SMR), is proposed for use in "spurious" heavy subgraphs to automate the discovery by setting a fixed threshold. Empirical results on both the simulated graphs and biological networks have demonstrated the efficiency and effectiveness of our proposal.

18.
Boehm T. Current Biology 2012, 22(17): R722-R732
All multicellular organisms protect themselves against pathogens using sophisticated immune defenses. Functionally interconnected humoral and cellular facilities maintain immune homeostasis in the absence of overt infection and regulate the initiation and termination of immune responses directed against pathogens. Immune responses of invertebrates, such as flies, are innate and usually stereotyped; those of vertebrates, encompassing species as diverse as jawless fish and humans, are additionally adaptive, enabling more rapid and efficient immune reactivity upon repeated encounters with a pathogen. Many of the attributes historically defining innate and adaptive immunity are in fact common to both, blurring their functional distinction and emphasizing shared ancestry and co-evolution. These findings provide indications of the evolutionary forces underlying the origin of somatic diversification of antigen receptors and contribute to our understanding of the complex phenotypes of human immune disorders. Moreover, informed by phylogenetic considerations and inspired by improved knowledge of functional networks, new avenues emerge for innovative therapeutic strategies.

19.

Background  

Genome and metagenome studies have identified thousands of protein families whose functions are poorly understood and for which techniques for functional characterization provide only partial information. For such proteins, the genome context can give further information about their functional context.

20.