Similar literature (20 results)
1.
Shannon’s information-theoretic perspective on communication helps one to understand the storage and processing of information in one-dimensional sequences. An information-theoretic analysis of 937 completely sequenced prokaryotic genomes and 238 eukaryotic chromosomes is presented. Information content (Id) values were used to cluster these chromosomes. Chargaff’s second parity rule, i.e., compositional self-complementarity, is an empirical fact observed in all the genomes except that of the proteobacterium Candidatus Hodgkinia cicadicola. High information content, arising from biased base composition, is found in all 14 chromosomes of Plasmodium falciparum and in two prokaryotic genomes, viz. Buchnera aphidicola str. Cc (Cinara cedri) and Candidatus Carsonella ruddii PV. Despite variations in size and composition, neither prokaryotic nor eukaryotic genomes deviate significantly from an equiprobable and random situation. The chromosomes of a given eukaryotic organism tend to have similar informational restraints, as seen when a simple distance-based method is used to cluster them. In certain eukaryotes, Id values are also similar for the two arms (p and q) of a chromosome. The results of this study confirm that information content can provide insights into the clustering of genomes and the evolution of their messaging strategies. An efficient and robust standalone Perl CGI tool based on this information-theoretic algorithm was created for the analysis of whole genomes and is available at https://github.com/AlagurajVeluchamy/InformationTheory.
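The abstract above does not define Id, so the following is only a minimal sketch of the simplest related quantity: the Shannon entropy of the base composition and its divergence from the equiprobable case (log2 4 = 2 bits). The function names are hypothetical, and the published tool may well use a richer measure (for example one that also accounts for neighbouring-base dependence).

```python
import math
from collections import Counter


def base_entropy(seq: str) -> float:
    """Shannon entropy (bits per base) of the A/C/G/T composition of seq.

    Characters other than A, C, G, T (e.g. N) are ignored.
    """
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def divergence_from_equiprobability(seq: str) -> float:
    """Assumed illustration: log2(4) minus the observed entropy.

    The value is 0 for a perfectly uniform base composition and grows
    as the composition becomes more biased.
    """
    return 2.0 - base_entropy(seq)
```

Under this assumption, a strongly AT-biased genome would give a larger divergence than a compositionally balanced one, which is the kind of contrast the abstract describes for Plasmodium falciparum.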

2.
The Cambrian explosion is a grand challenge to science today and involves multidisciplinary study. The event is generally believed to result from genetic innovations, environmental factors and ecological interactions, although there are many disagreements about the nature and timing of metazoan origins. The crux of the matter is that a complete roadmap of evolution, needed to discern the transition in biological complexity and to evaluate the critical role of the Cambrian explosion in the overall evolutionary context, is missing. Here, we calculate the time of the Cambrian explosion using a “C-value clock”; our result fits the fossil record well. We argue that intrinsic features of genome evolution determined the Cambrian explosion. A general formula for evaluating the genome size of different species has been found, by which genome size evolution can be illustrated. The Cambrian explosion, as a major transition in biological complexity, essentially corresponds to a critical turning point in genome size evolution.

3.
The mustard family, Brassicaceae, is well known for its homoplasy in almost any morphological character at practically all taxonomic levels. The genus Arabis, within the largest tribe of the Brassicaceae, is such an example, comprising numerous para- and polyphyletic groups of taxa. Research over the last 15 years has unraveled many phylogenetic relationships among the ∼550 (or more) species within the notoriously difficult tribe Arabideae. The European Arabis hirsuta species aggregate has remained unexplored, however. Herein we analyze phylogenetic relationships using nuclear ITS and plastid DNA sequences of Eurasian Arabis to characterize Hairy rock cress (A. hirsuta) and its relatives. Representative geographic sampling is used to study character and trait evolution, and bioclimatic data are used to differentiate between species. Our overview puts European Arabis into a reliable evolutionary framework, and we provide some striking insights into evolutionary trends by correlating morphological characters from seeds and flowers with environmental data such as climate variables and elevation. We demonstrate independent parallel evolution of sets of traits, and we could therefore further elaborate our previous findings that within tribe Arabideae high speciation rates are correlated with perennial growth form and occurrence at higher elevation. Finally, some taxonomic remarks are provided for context.

4.
The identification of gastrointestinal helminth infections of humans and livestock almost exclusively relies on the detection of eggs or larvae in faeces, followed by manual counting and morphological characterisation to differentiate species using microscopy-based techniques. However, molecular approaches based on the detection and quantification of parasite DNA are becoming more prevalent, increasing the sensitivity, specificity and throughput of diagnostic assays. High-throughput sequencing, from single PCR targets through to the analysis of whole genomes, offers significant promise towards providing information-rich data that may add value beyond traditional and conventional molecular approaches; however, thus far, its utility has not been fully explored to detect helminths in faecal samples. In this study, low-depth whole genome sequencing, i.e. genome skimming, has been applied to detect and characterise helminth diversity in a set of helminth-infected human and livestock faecal material. The strengths and limitations of this approach are evaluated using three methods to characterise and differentiate metagenomic sequencing data based on (i) mapping to whole mitochondrial genomes, (ii) whole genome assemblies, and (iii) a comprehensive internal transcribed spacer 2 (ITS2) database, together with validation using quantitative PCR (qPCR). Our analyses suggest that genome skimming can successfully identify most single and multi-species infections reported by qPCR and can provide sufficient coverage within some samples to resolve consensus mitochondrial genomes, thus facilitating phylogenetic analyses of selected genera, e.g. Ascaris spp. Key to this approach is both the availability and integrity of helminth reference genomes, some of which are currently contaminated with bacterial and host sequences. 
The success of genome skimming of faecal DNA is dependent on the availability of vouchered sequences of helminths spanning both taxonomic and geographic diversity, together with methods to detect or amplify minute quantities of parasite nucleic acids in mixed samples.

5.
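The non-random usage of k-mers in genome sequences and its biological significance have long attracted attention, but no fundamental progress has yet been made. Using all gene sequences of seven species as samples, this paper obtains the 8-mer frequency spectrum of the genome sequences of each species. The spectra of dog and cattle have three peaks, whereas those of zebrafish, medaka, Caenorhabditis elegans and Saccharomyces cerevisiae have a single peak; the shape of the chicken spectrum lies between the two. When the set of 8-mers is classified by XY dinucleotide content, only under the CG dinucleotide classification do the spectra of the three subsets (0CG, 1CG and 2CG) each form an independent unimodal distribution. Comparison with random sequences shows that 0CG motifs evolve randomly, whereas 1CG and 2CG motifs evolve directionally, with usage frequencies far below the random expectation; this rule of independent evolutionary separation holds across species. The distances between the spectra of the three CG subsets are the fundamental cause of the unimodal or multimodal appearance. After normalizing the genome sequences of the seven species to 10^9 bp, comparison shows that the 1CG and 2CG subset spectra are significantly correlated with species evolution, whereas the 0CG subset spectrum is not. The three kinds of CG motifs can therefore be considered to perform different biological functions. The independent separation rule of 8-mers in genome sequences offers a new way of thinking about genome structure, genome evolution and the biological functions of motifs.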
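The classification described in this abstract can be sketched as follows. This is a hedged illustration, not the authors' pipeline: the function names are hypothetical, the spectrum here covers only k-mers that occur at least once, and the normalization to 10^9 bp mentioned in the abstract is not reproduced.

```python
from collections import Counter


def kmer_counts(seq: str, k: int = 8) -> Counter:
    """Count every overlapping k-mer in seq, skipping windows with non-ACGT bases."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def split_by_cg(counts: Counter) -> dict:
    """Group k-mers by the number of CG dinucleotides they contain (0CG, 1CG, 2CG, ...)."""
    subsets = {}
    for kmer, n in counts.items():
        subsets.setdefault(kmer.count("CG"), Counter())[kmer] = n
    return subsets


def frequency_spectrum(counts: Counter) -> Counter:
    """Map each usage count to the number of distinct k-mers observed that many times."""
    return Counter(counts.values())
```

Calling `frequency_spectrum` on each value returned by `split_by_cg` gives one spectrum per CG class, which is the object whose peak structure the abstract compares across species.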

6.
Pie MR, Torres RA, Brito DM. Genetica, 2007, 131(1): 51-58
Despite remarkable advances in genomic studies over the past few decades, surprisingly little is known about the processes governing genome evolution at macroevolutionary timescales. In a seminal paper, Hinegardner and Rosen (Am Nat 106:621–644, 1972) suggested that taxa characterized by larger genomes should also display disproportionately stronger fluctuations in genome size. Therefore, according to the Hinegardner and Rosen (HR) hypothesis, there should be a negative correlation between average within-family genome size and its corresponding coefficient of variation (CV), a prediction that was supported by their analysis of the genomes of 275 species of fish. In this study we reevaluate the HR hypothesis using an expanded dataset (2050 genome size records). Moreover, in addition to the use of standard linear regression techniques, we also conducted modern comparative analyses that take into account phylogenetic non-independence. Our analyses failed to confirm the negative relationship detected in the original study, suggesting that the evolution of genome size in fishes might be more complex than envisioned by the HR hypothesis. Interestingly, the frequency distribution of fish genome sizes was strongly skewed, even on a logarithmic scale, suggesting that the dynamics underlying genome size evolution are driven by multiplicative phenomena, which might include chromosomal rearrangements and the expansion of transposable elements.
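The quantity tested in this abstract, the relationship between within-family mean genome size and its coefficient of variation, can be sketched as below. This is an assumed, non-phylogenetic illustration only: the function names are hypothetical, the input layout (a mapping from family name to a list of genome sizes) is invented for the example, and the comparative analyses that correct for phylogenetic non-independence are not attempted.

```python
import math
from statistics import mean, stdev


def family_mean_and_cv(sizes_by_family: dict) -> dict:
    """Return {family: (mean size, CV)} for families with at least two records.

    CV is the sample standard deviation divided by the mean.
    """
    result = {}
    for family, sizes in sizes_by_family.items():
        if len(sizes) < 2:
            continue  # CV is undefined for a single record
        m = mean(sizes)
        result[family] = (m, stdev(sizes) / m)
    return result


def pearson_r(xs, ys) -> float:
    """Plain Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

Passing the per-family means and CVs to `pearson_r` gives the ordinary correlation whose sign the abstract discusses; a phylogenetically informed analysis would need a tree and a dedicated comparative method.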

7.
Summary: Fifty random clones (350–2300 bp), derived from sheared nuclear DNA, were studied via Southern analysis in order to make deductions about the organization and evolution of the tomato genome. Thirty-four of the clones were mapped genetically and determined to represent points on 11 of the 12 tomato chromosomes. Under moderate stringency conditions (80% homology required) 44% of the clones were classified as single copy. Under higher stringency, the majority of the clones (78%) behaved as single copy. Most of the remaining clones belonged to multicopy families containing 2–20 copies, while a few contained moderately or highly repeated sequences (10% at moderate stringency, 4% at high stringency). Divergence rates of sequences homologous to the 50 random genomic clones were compared with those corresponding to 20 previously described cDNA (coding sequence) clones. Rates were measured by probing each clone (random genomic clones and cDNAs) onto filters containing DNA from various species of the family Solanaceae (including potato, Datura, petunia and tobacco) as well as one species (watermelon) from another plant family, Cucurbitaceae. Under moderate stringency conditions, the majority of the random clones (single copy and repetitive) failed to detect homologous sequences in the more distantly related species, whereas approximately 90% of the 20 coding sequences analyzed could still be detected in all solanaceous species. The most highly repeated sequences appear to be the fastest evolving, and homologous copies could be detected only in the species most closely related to tomato. Dispersion of repetitive sequences, as opposed to tandem clustering, appears to be the rule for the tomato genome. None of the repetitive sequences discovered by this random sampling of the genome were tandemly arranged, a finding consistent with the notion that the tomato genome contains only a small fraction of satellite DNA. This study, along with a companion paper (Ganal et al. 1988), provides the first general sketch of the tomato genome at the molecular level and indicates that it consists largely of single-copy sequences and that these sequences, together with repetitive sequences, are evolving faster than the coding portion of the genome. The small genome and paucity of highly repetitive DNA are favourable attributes for conducting chromosome walking experiments in tomato, and the fact that coding regions are well conserved among solanaceous species may be useful for distinguishing clones that contain coding regions from those that do not.

8.
Macqueen DJ, Johnston IA. FEBS Letters, 2006, 580(21): 4996-5002
A novel myoD paralogue was characterised in Salmo salar (smyoD1c) and S. trutta (btmyoD1c). SmyoD1c had 78.2/90.6% protein sequence identity to smyoD1a/smyoD1b, respectively. Each paralogue was differentially expressed throughout somitogenesis. In adult fish, smyoD1a was the predominant gene expressed in fast muscle, whereas smyoD1c was 2-3 times upregulated in slow muscle compared to smyoD1a/1b. A maximum likelihood analysis indicated that myoD1c arose by duplication of myoD1b after the salmonid tetraploidization. Another myoD paralogue (myoD2) is present in at least some teleosts, reflecting a more ancient genome duplication. To accommodate these findings we propose a simplified teleost-myoD nomenclature.

9.
The integration-based genome database provides useful information through a user-friendly web interface that allows analysis of comparative genome for agricultural plants. We have concentrated on the functional bioinformatics of major agricultural resources, such as rice, Chinese cabbage, rice mutant lines, and microorganisms. The major functions are focused on functional genome analysis, including genome projects, gene expression analysis, gene markers with genetic map, analysis tools for comparative genome structure, and genome annotation in agricultural plants.

Availability

The database is available for free at http://nabic.naas.go.kr/

10.
A hAT superfamily transposase recruited by the cereal grass genome (total citations: 1; self-citations: 0; citations by others: 1)
Transposable elements are ubiquitous genomic parasites with an ancient history of coexistence with their hosts. A few cases have emerged recently where these genetic elements have been recruited for normal function in the host organism. We have identified an expressed hobo/Ac/Tam (hAT) family transposase-like gene in cereal grasses which appears to represent such a case. This gene, which we have called gary, is found in one or two copies in barley, two diverged copies in rice and two very similar copies in hexaploid wheat. No gary homologues are found in Arabidopsis. In all three cereal species, an apparently complete 2.5 kb transposase-like open reading frame is present and nucleotide substitution data show evidence for positive selection, yet the predicted gary protein is probably not an active transposase, as judged by the absence of key amino acids required for transposase function. Gary is expressed in wheat and barley spikes, and gary cDNA sequences are also found in rice, oat, rye, maize, sorghum and sugarcane. The short inverted terminal repeats, flanked by an eight-nucleotide host sequence duplication, that are characteristic of a hAT transposon are absent. Genetic mapping in barley shows that gary is located on the distal end of the long arm of chromosome 2H. Wheat homologues of gary map to the same approximate location on the wheat group 2 chromosomes by physical bin-mapping, and the more closely related of the two rice garys maps to the syntenic location near the bottom of rice chromosome 4. These data suggest that gary has resided in a single genomic location for at least 60 Myr and has lost the ability to transpose, yet expresses a transposase-related protein that is being conserved under host selection. We propose that the gary transposase-like gene has been recruited by the cereal grasses for an unknown function.

11.
The contribution of slippage-like processes to genome evolution (total citations: 19; self-citations: 0; citations by others: 19)
Simple sequences present in long (>30 kb) sequences representative of the single-copy genome of five species (Homo sapiens, Caenorhabditis elegans, Saccharomyces cerevisiae, E. coli, and Mycobacterium leprae) have been analyzed. A close relationship was observed between genome size and the overall level of sequence repetition. This suggested that the incorporation of simple sequences had accompanied increases of genome size during evolution. Densities of simple sequence motifs were higher in noncoding regions than in coding regions in eukaryotes but not in eubacteria. All five genomes showed very biased frequency distributions of simple sequence motifs, particularly the eukaryotes, where AAA and TTT predominated. Interspecific comparisons showed that noncoding sequences in eukaryotes had highly significantly similar frequency distributions of simple sequence motifs, but this was not true of coding sequences. ANOVA of the frequency distributions of simple sequence motifs indicated strong contributions from motif base composition and repeat unit length, but much of the variation remained unexplained by these parameters. The sequence composition of simple sequences therefore appears to reflect both underlying sequence biases in slippage-like processes and the action of selection. Frequency distributions of simple sequence motifs in coding sequences correlated weakly or not at all with those in noncoding sequences. Selection on coding sequences to eliminate undesirable sequences may therefore have been strong, particularly in the human lineage.

12.
We used sequence data from 30 complete Herpesviridae genomes to investigate phylogenetic relationships and patterns of genome evolution. The approach was to identify orthologous gene clusters among taxa and to generate a genomic matrix of gene content. We identified 17 genes with homologs in all 30 taxa and concatenated a subset of 10 of these genes for phylogenetic inference. We also constructed phylogenetic trees on the basis of gene content data. The amino acid and gene content phylogenies were largely concordant, but the amino acid data had much higher internal support. We mapped gene gain events onto the phylogenetic tree by assuming that genes were gained only once during the evolution of herpesviruses. Thirty genes were inferred to be present in the ancestor of all herpesviruses, a number smaller than previously hypothesized. Few genes of recent origin within herpesviruses could be identified as originating from transfer between virus and vertebrate hosts. Inferred rates of gene gain were heterogeneous, with both taxonomic and temporal biases. Nonetheless, the average rate of gene gain was approximately 3.5 × 10^-7 genes gained per year, which is an order of magnitude higher than the nucleotide mutation rate for these large DNA viruses.
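The gene content matrix described above can be pictured as a mapping from each taxon to the set of orthologous gene clusters it carries. The sketch below is an assumption-laden illustration: the function names and the input layout are hypothetical, and the Jaccard distance is swapped in as one common choice of gene content distance, since the abstract does not say which measure the authors used.

```python
def core_genes(content: dict) -> set:
    """Gene clusters present in every taxon of a {taxon: set of cluster IDs} mapping."""
    sets = list(content.values())
    return set.intersection(*sets) if sets else set()


def jaccard_distance(a: set, b: set) -> float:
    """1 minus shared clusters over all clusters seen in either taxon (assumed measure)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(content: dict) -> dict:
    """Pairwise gene content distances, keyed by (taxon, taxon) tuples."""
    taxa = sorted(content)
    return {(s, t): jaccard_distance(content[s], content[t]) for s in taxa for t in taxa}
```

Applied to a full matrix, `core_genes` would return the clusters shared by all taxa (the abstract reports 17 such genes for its 30 genomes), and the pairwise distances could be fed to any standard distance-based tree-building method.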

13.
Phylogenetic trees based on gene repertoires are remarkably similar to the current consensus of life history. Yet it has been argued that shared gene content is unreliable for phylogenetic reconstruction because of convergence in gene content due to horizontal gene transfer and parallel gene loss. Here we test this argument, by filtering out as noise those orthologous groups that have an inconsistent phylogenetic distribution, using two independent methods. The resulting phylogenies do indeed contain small but significant improvements. More importantly, we find that the majority of orthologous groups contain some phylogenetic signal and that the resulting phylogeny is the only detectable signal present in the gene distribution across genomes. Horizontal gene transfer or parallel gene loss does not cause systematic biases in the gene content tree.

14.
Microbial genome sequences provide a fossil record for inferring their origin and evolution. Assuming that current microbial genomes are the evolutionary results of ancient genomes or fragments, and that genes neighboring each other in ancient genomes are more likely to be neighbors in current genomes, we propose a paleontological algorithm and assemble the orthologous gene groups from 66 complete, current microbial genome sequences into a pseudo-ancient genome, which consists of continuous fragments of various sizes. We performed bootstrap resampling and correlation analyses, and the results showed that the assembled ancient genome and fragments are statistically significant and that the genes of the same fragment are inherently related and likely derived from common ancestors. This method provides a new computational tool for studying microbial genome structure and evolution.

15.
While our understanding of gene-based biology has greatly improved, it is clear that the function of the genome and most diseases cannot be fully explained by genes and other regulatory elements. Genes and the genome represent distinct levels of genetic organization with their own coding systems: genes code for parts such as proteins and RNA, whereas the genome codes for the structure of genetic networks, which are defined by the whole set of genes, chromosomes and their topological interactions within a cell. Accordingly, the genetic code of DNA offers limited understanding of genome functions. In this perspective, we introduce the genome theory, which calls for a departure from gene-centric genomic research. To make this transition to the next phase of genomic research, it is essential to acknowledge the importance of new genome-based biological concepts and to establish new technology platforms to decode the genome beyond sequencing.

16.
Journal of Genetics and Genomics (遗传学报), 2022, 49(2): 120-131
Melastomataceae has abundant morphological diversity with high economic and ornamental merit in Myrtales. The phylogenetic position of Myrtales is still contested. Here, we report the chromosome-level genome assembly of Melastoma dodecandrum in Melastomataceae. The assembled genome size is 299.81 Mb with a contig N50 value of 3.00 Mb. Genome evolution analysis indicated that M. dodecandrum, Eucalyptus grandis, and Punica granatum were clustered into a clade of Myrtales and formed a sister group with the ancestor of fabids and malvids. We found that M. dodecandrum experienced four whole-genome polyploidization events: the ancient event was shared with most eudicots, one event was shared with Myrtales, and the other two events were unique to M. dodecandrum. Moreover, we identified MADS-box genes and found that the AP1-like genes expanded, and AP3-like genes might have undergone subfunctionalization. The SUAR63-like genes and AG-like genes showed different expression patterns in stamens, which may be associated with heteranthery. In addition, we found that LAZY1-like genes were involved in the negative regulation of stem branching development, which may be related to its creeping features. Our study sheds new light on the evolution of Melastomataceae and Myrtales, which provides a comprehensive genetic resource for future research.

17.
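Phylogenetic trees were constructed from codon-pair usage bias and from the dinucleotide frequencies within codon pairs, respectively. The tree built from the dinucleotide frequencies of codon pairs in the coding sequences of 40 model organisms clearly separates the organisms, in line with their evolution, into bacteria, archaea and eukaryotes; the tree built from the codon-pair usage bias index is largely consistent with the tree based on dinucleotide frequencies within codon pairs. The results indicate that the dinucleotide composition of codon pairs is one of the determinants of codon-pair bias.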

18.
19.
A theory of an early stage of genome evolution by combinatorial fusion of circular DNA units is suggested, based on protein sequence fossil evidence. The evidence includes a preference of protein sequence lengths for certain sizes: multiples of 123 aa for eukaryotes and multiples of 152 aa for prokaryotes. At the DNA level these sizes correspond to 350–450 base pairs, the known optimal range for DNA ring closure. Methionine residues repeatedly appear along the sequences with the same period of about 120 aa (in eukaryotes), presumably marking the sites of insertion of the early genes, which were rings of protein-coding DNA. The absence of torsional constraint in this DNA results in a very sharp estimate of the helical periodicity of the early DNA, indistinguishable from the experimental mean value for extant DNA. According to the combinatorial fusion theory, based on the above evidence, in the pregenomic, prerecombinational stage the genes and the noncoding sequences existed in the form of autonomously replicating DNA rings of close to standard size, randomly segregating between dividing cells, as modern plasmids do. In the recombinational early genomic stage the rings started to fuse, forming larger DNA molecules consisting of several unit genes connected in various combinations and forming long protein-coding sequences (combinatorial fusion). This process, which perhaps involved noncoding sequences as well, eventually resulted in the formation of large genomes. The dispersed circular DNA (or, rather, evolutionarily advanced derivatives thereof) may still exist in the form of various mobile DNA elements.

20.
Molecular characterization of a cloned dolphin mitochondrial genome (total citations: 11; self-citations: 0; citations by others: 11)
Summary: DNA clones have been isolated that span the complete mitochondrial (mt) genome of the dolphin, Cephalorhynchus commersonii. Hybridization experiments with purified primate mtDNA probes have established that the general organization of the dolphin mt genome closely resembles that of terrestrial mammalian mt genomes. Sequences covering 2381 bp of the dolphin mt genome from the major noncoding region, three tRNA genes, and parts of the genes encoding cytochrome b, NADH dehydrogenase subunit 3 (ND3), and 16S rRNA have been compared with corresponding regions from other mammalian genomes. There is a general tendency throughout the sequenced regions for greater similarity between dolphin and bovine mt genomes than between dolphin and rodent or human mt genomes.
