Similar literature
 Found 20 similar articles; search took 31 ms
1.
Roundup: a multi-genome repository of orthologs and evolutionary distances   (Total citations: 1; self-citations: 0; citations by others: 1)
SUMMARY: We have created a tool for ortholog and phylogenetic profile retrieval called Roundup. Roundup is backed by a massive repository of orthologs and associated evolutionary distances that was built using the reciprocal smallest distance algorithm, an approach that has been shown to improve upon alternative approaches of ortholog detection, such as reciprocal blast. Presently, the Roundup repository contains all possible pair-wise comparisons for over 250 genomes, including 32 Eukaryotes, more than doubling the coverage of any similar resource. The orthologs are accessible through an intuitive web interface that allows searches by genome or gene identifier, presenting results as phylogenetic profiles together with gene and molecular function annotations. Results may be downloaded as phylogenetic matrices for subsequent analysis, including the construction of whole-genome phylogenies based on gene-content data. AVAILABILITY: http://rodeo.med.harvard.edu/tools/roundup.
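The reciprocal smallest distance idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes that pairwise evolutionary distances between genes of two genomes have already been estimated and are passed in as a plain dictionary; the function name and the data layout are hypothetical, and the real pipeline (similarity search, alignment, distance estimation) is not reproduced here.

```python
def reciprocal_smallest_distance(dist):
    """Return putative ortholog pairs from precomputed distances.

    dist maps (gene_a, gene_b) -> estimated evolutionary distance, where
    gene_a is from genome A and gene_b from genome B (assumed input format).
    A pair is kept only if each gene is the other's closest partner.
    """
    best_for_a = {}  # gene in A -> (closest gene in B, distance)
    best_for_b = {}  # gene in B -> (closest gene in A, distance)
    for (a, b), d in dist.items():
        if a not in best_for_a or d < best_for_a[a][1]:
            best_for_a[a] = (b, d)
        if b not in best_for_b or d < best_for_b[b][1]:
            best_for_b[b] = (a, d)
    # Keep only the pairs that are reciprocal.
    return sorted(
        (a, b, d)
        for a, (b, d) in best_for_a.items()
        if best_for_b[b][0] == a
    )
```

In this toy form, a gene whose closest partner prefers another gene is simply left without an ortholog, which is the conservative behaviour that a reciprocal criterion is meant to provide.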

2.
SUMMARY: Phylogenetic Web Profiler (PWP) is a web-based service designed to perform phylogenetic profiling of proteins against genomes. The current version offers a selection of 63 completed genomes and available plasmids as annotated in the PEDANT genome database. Unlike currently available applications, this tool offers several choices of ortholog prediction parameters including E-value cutoff, percent length difference tolerance, and annotation similarity. Additional features include tight integration with the PEDANT database and tools to analyze properties of predicted proteins. PWP should prove very useful for the analysis of functional-linkage between proteins.
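Two of the ortholog prediction parameters named in this abstract, the E-value cutoff and the percent length difference tolerance, can be sketched as a simple filter on candidate hits. Everything below is an assumption made for illustration: the `Hit` record, the default thresholds, and the choice to measure the length difference relative to the longer sequence are hypothetical and are not taken from PWP.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """A candidate query/subject match (hypothetical record layout)."""
    query_len: int
    subject_len: int
    evalue: float


def passes_filters(hit, max_evalue=1e-5, max_len_diff_pct=20.0):
    """Accept a hit only if it meets both thresholds.

    The default thresholds are illustrative placeholders. The length
    difference is expressed as a percentage of the longer sequence,
    which is one possible definition among several.
    """
    longer = max(hit.query_len, hit.subject_len)
    diff_pct = 100.0 * abs(hit.query_len - hit.subject_len) / longer
    return hit.evalue <= max_evalue and diff_pct <= max_len_diff_pct
```

Tightening either threshold trades sensitivity for specificity, which is presumably why a profiling service would expose them as user choices.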

3.
To investigate the molecular epidemiology of DuCV in Cherry Valley ducks in China, the complete genomes of six DuCV strains, which were detected from Cherry Valley ducks in China between 2007 and 2008, were s...

4.
Shi G, Peng MC, Jiang T. PLoS ONE, 2011, 6(6): e20892
The identification of orthologous genes shared by multiple genomes plays an important role in evolutionary studies and gene functional analyses. Based on a recently developed accurate tool, called MSOAR 2.0, for ortholog assignment between a pair of closely related genomes based on genome rearrangement, we present a new system MultiMSOAR 2.0, to identify ortholog groups among multiple genomes in this paper. In the system, we construct gene families for all the genomes using sequence similarity search and clustering, run MSOAR 2.0 for all pairs of genomes to obtain the pairwise orthology relationship, and partition each gene family into a set of disjoint sets of orthologous genes (called super ortholog groups or SOGs) such that each SOG contains at most one gene from each genome. For each such SOG, we label the leaves of the species tree using 1 or 0 to indicate if the SOG contains a gene from the corresponding species or not. The resulting tree is called a tree of ortholog groups (or TOGs). We then label the internal nodes of each TOG based on the parsimony principle and some biological constraints. Ortholog groups are finally identified from each fully labeled TOG. In comparison with a popular tool MultiParanoid on simulated data, MultiMSOAR 2.0 shows significantly higher prediction accuracy. It also outperforms MultiParanoid, the Roundup multi-ortholog repository and the Ensembl ortholog database in real data experiments using gene symbols as a validation tool. In addition to ortholog group identification, MultiMSOAR 2.0 also provides information about gene births, duplications and losses in evolution, which may be of independent biological interest. Our experiments on simulated data demonstrate that MultiMSOAR 2.0 is able to infer these evolutionary events much more accurately than a well-known software tool Notung. The software MultiMSOAR 2.0 is available to the public for free.
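The step of labelling internal nodes of a 0/1-labelled species tree by parsimony can be illustrated with the classic Fitch bottom-up pass. This is a generic small-parsimony sketch and not the MultiMSOAR 2.0 procedure, which also applies biological constraints that the abstract does not spell out; the nested-tuple tree encoding and the species names are assumptions.

```python
def fitch_sets(tree, leaf_state):
    """Fitch bottom-up pass on a binary tree.

    tree is either a leaf name (str) or a pair (left, right).
    leaf_state maps leaf name -> 1 if the group has a gene in that
    species, 0 otherwise. Returns (candidate_states, min_changes).
    """
    if isinstance(tree, str):
        return {leaf_state[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_sets(left, leaf_state)
    right_set, right_cost = fitch_sets(right, leaf_state)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # Children disagree: keep both options and count one gain or loss.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical four-species tree and presence/absence labels.
species_tree = (("human", "mouse"), ("chicken", "zebrafish"))
labels = {"human": 1, "mouse": 1, "chicken": 0, "zebrafish": 0}
root_states, changes = fitch_sets(species_tree, labels)
```

A node whose candidate set is {0, 1} is ambiguous under plain parsimony, which is one place where additional constraints of the kind the abstract mentions could be used to break ties.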

5.
Seven barley species have been compared for organization of repeated sequences. Quantitative variation of repeated DNA fractions is demonstrated, though the total amount of sequences (reassociation up to Cot=10) in most cases does not vary. The repeats are divided into four groups by the mode of interspecific variability, with the help of dot and blot hybridization of the genomes under study with cloned highly repeated sequences of Hordeum vulgare. The first group contains the pHv7161 family of the most conservative sequences. The second group comprises moderately changing repeats. The third group includes highly variable Hind III repeats of Hordeum genomes, and the fourth group is represented by the pHv7191 family of repeats that are highly amplified in the H. vulgare genome. Comparative analysis of the content and organization of highly repeated sequences in the genome helps to clarify phylogenetic relationships in the genus and can be used to predict the success of interspecific hybridization.

6.
Our understanding of the evolutionary history of primates is undergoing continual revision due to ongoing genome sequencing efforts. Bolstered by growing fossil evidence, these data have led to increased acceptance of once controversial hypotheses regarding phylogenetic relationships, hybridization and introgression, and the biogeographical history of primate groups. Among these findings is a pattern of recent introgression between species within all major primate groups examined to date, though little is known about introgression deeper in time. To address this and other phylogenetic questions, here, we present new reference genome assemblies for 3 Old World monkey (OWM) species: Colobus angolensis ssp. palliatus (the black and white colobus), Macaca nemestrina (southern pig-tailed macaque), and Mandrillus leucophaeus (the drill). We combine these data with 23 additional primate genomes to estimate both the species tree and individual gene trees using thousands of loci. While our species tree is largely consistent with previous phylogenetic hypotheses, the gene trees reveal high levels of genealogical discordance associated with multiple primate radiations. We use strongly asymmetric patterns of gene tree discordance around specific branches to identify multiple instances of introgression between ancestral primate lineages. In addition, we exploit recent fossil evidence to perform fossil-calibrated molecular dating analyses across the tree. Taken together, our genome-wide data help to resolve multiple contentious sets of relationships among primates, while also providing insight into the biological processes and technical artifacts that led to the disagreements in the first place.

Combining three newly sequenced primate genomes with other published genomes, this study adapts a little-known method for detecting ancient introgression to genome-scale data, revealing multiple previously unknown examples of hybridization between primate species.

7.
We present a method for automatically extracting groups of orthologous genes from a large set of genomes by a new clustering algorithm on a weighted multipartite graph. The method assigns a score to an arbitrary subset of genes from multiple genomes to assess the orthologous relationships between genes in the subset. This score is computed using sequence similarities between the member genes and the phylogenetic relationship between the corresponding genomes. An ortholog cluster is found as the subset with the highest score, so ortholog clustering is formulated as a combinatorial optimization problem. The algorithm for finding an ortholog cluster runs in time O(|E| + |V| log |V|), where V and E are the sets of vertices and edges, respectively, in the graph. However, if we discretize the similarity scores into a constant number of bins, the runtime improves to O(|E| + |V|). The proposed method was applied to seven complete eukaryote genomes on which the manually curated database of eukaryotic ortholog clusters, KOG, is constructed. A comparison of our results with the manually curated ortholog clusters shows that our clusters are well correlated with the existing clusters.
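The remark that discretizing scores into a constant number of bins removes the log factor can be illustrated with a bucket ordering. The sketch below is not the authors' clustering algorithm; it only shows, under the assumption that scores lie in [0, 1], how items can be ordered by binned score in time linear in the number of items, where a comparison sort would need O(n log n). The function name and the number of bins are hypothetical.

```python
def bucket_order(scores, n_bins=10):
    """Order items by discretized score in O(n + n_bins) time.

    scores maps item -> similarity score, assumed to lie in [0, 1].
    Items falling in the same bin keep their input order.
    """
    bins = [[] for _ in range(n_bins)]
    for item, score in scores.items():
        # A score of exactly 1.0 is clamped into the last bin.
        index = min(int(score * n_bins), n_bins - 1)
        bins[index].append(item)
    return [item for bucket in bins for item in bucket]
```

The price of the speed-up is resolution: items within one bin are no longer distinguished by score, so the number of bins controls how closely the result tracks the exact ordering.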

8.
We have determined the complete chloroplast genome sequences of four early-diverging lineages of angiosperms, Buxus (Buxaceae), Chloranthus (Chloranthaceae), Dioscorea (Dioscoreaceae), and Illicium (Schisandraceae), to examine the organization and evolution of plastid genomes and to estimate phylogenetic relationships among angiosperms. For the most part, the organization of these plastid genomes is quite similar to the ancestral angiosperm plastid genome with a few notable exceptions. Dioscorea has lost one protein-coding gene, rps16; this gene loss has also happened independently in four other land plant lineages, liverworts, conifers, Populus, and legumes. There has also been a small expansion of the inverted repeat (IR) in Dioscorea that has duplicated trnH-GUG. This event has also occurred multiple times in angiosperms, including in monocots, and in the two basal angiosperms Nuphar and Drimys. The Illicium chloroplast genome is unusual by having a 10 kb contraction of the IR. The four taxa sequenced represent key groups in resolving phylogenetic relationships among angiosperms. Illicium is one of the basal angiosperms in the Austrobaileyales, Chloranthus (Chloranthales) remains unplaced in angiosperm classifications, and Buxus and Dioscorea are early-diverging eudicots and monocots, respectively. We have used sequences for 61 shared protein-coding genes from these four genomes and combined them with sequences from 35 other genomes to estimate phylogenetic relationships using parsimony, likelihood, and Bayesian methods. There is strong congruence among the trees generated by the three methods, and most nodes have high levels of support. 
The results indicate that Amborella alone is sister to the remaining angiosperms; the Nymphaeales represent the next-diverging clade followed by Illicium; Chloranthus is sister to the magnoliids and together this group is sister to a large clade that includes eudicots and monocots; and Dioscorea represents an early-diverging lineage of monocots just internal to Acorus.

9.
Retroids in archaea: phylogeny and lateral origins   (Total citations: 3; self-citations: 0; citations by others: 3)

10.
Phylogenomics is aimed at studying functional and evolutionary aspects of genome biology using phylogenetic analysis of whole genomes. Current approaches to genome phylogenies are commonly founded in terms of phylogenetic trees. However, several evolutionary processes are non-tree-like in nature, including recombination and lateral gene transfer (LGT). Phylogenomic networks are a special type of phylogenetic network reconstructed from fully sequenced genomes. The network model, comprising genomes connected by pairwise evolutionary relations, enables the reconstruction of both vertical and LGT events. Modeling genome evolution in the form of a network enables the use of an extensive toolbox developed for network research. The structural properties of phylogenomic networks open up fundamentally new insights into genome evolution.

11.
The complete genome sequences of two Sulfolobus spindle-shaped viruses (SSVs) from acidic hot springs in Kamchatka (Russia) and Yellowstone National Park (United States) have been determined. These nonlytic temperate viruses were isolated from hyperthermophilic Sulfolobus hosts, and both viruses share the spindle-shaped morphology characteristic of the Fuselloviridae family. These two genomes, in combination with the previously determined SSV1 genome from Japan and the SSV2 genome from Iceland, have allowed us to carry out a phylogenetic comparison of these geographically distributed hyperthermal viruses. Each virus contains a circular double-stranded DNA genome of approximately 15 kbp with approximately 34 open reading frames (ORFs). These Fusellovirus ORFs show little or no similarity to genes in the public databases. In contrast, 18 ORFs are common to all four isolates and may represent the minimal gene set defining this viral group. In general, ORFs on one half of the genome are colinear and highly conserved, while ORFs on the other half are not. One shared ORF among all four genomes is an integrase of the tyrosine recombinase family. All four viral genomes integrate into their host tRNA genes. The specific tRNA gene used for integration varies, and one genome integrates into multiple loci. Several unique ORFs are found in the genome of each isolate.

12.
MITEs (miniature inverted-repeat transposable elements) are reminiscent of non-autonomous DNA (class II) elements, which are distinguished from other transposable elements by their small size, short terminal inverted repeats (TIRs), high copy numbers, genic preference, and DNA sequence identity among family members. Although MITEs were first discovered in plants and are still actively reshaping genomes, they have been isolated from a wide range of eukaryotic organisms. MITEs can be divided into Tourist-like, Stowaway-like, and pogo-like groups, according to similarities of their TIRs and TSDs (target site duplications). Despite several models proposed to explain the origin and amplification of MITEs, their mechanisms of transposition and accumulation in eukaryotic genomes remain poorly understood owing to insufficient experimental data. The unique properties of MITEs have been exploited as useful genetic tools for plant genome analysis. Utilization of MITEs as effective and informative genomic markers and pot

13.
Metazoa-level universal single-copy orthologs (mzl-USCOs) are universally applicable markers for DNA taxonomy in animals that can replace or supplement single-gene barcodes. Previously, mzl-USCOs from target enrichment data were shown to reliably distinguish species. Here, we tested whether USCOs are an evenly distributed, representative sample of a given metazoan genome and therefore able to cope with past hybridization events and incomplete lineage sorting. This is relevant for coalescent-based species delimitation approaches, which critically depend on the assumption that the investigated loci do not exhibit autocorrelation due to physical linkage. Based on 239 chromosome-level assembled genomes, we confirmed that mzl-USCOs are genetically unlinked for practical purposes and a representative sample of a genome in terms of reciprocal distances between USCOs on a chromosome and of distribution across chromosomes. We tested the suitability of mzl-USCOs extracted from genomes for species delimitation and phylogeny in four case studies: Anopheles mosquitos, Drosophila fruit flies, Heliconius butterflies and Darwin's finches. In almost all instances, USCOs allowed delineating species and yielded phylogenies that corresponded to those generated from whole genome data. Our phylogenetic analyses demonstrate that USCOs may complement single-gene DNA barcodes and provide more accurate taxonomic inferences. Combining USCOs from sources that used different versions of ortholog reference libraries to infer marker orthology may be challenging and, at times, impact taxonomic conclusions. However, we expect this problem to become less severe as the rapidly growing number of reference genomes provides a better representation of the number and diversity of organismal lineages.

14.
Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element–like and long interspersed element–like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes is higher than that between human and teleost fish genomes. The elephant shark genome contains four putative Hox clusters, indicating that, unlike teleost fish genomes, it has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes.

15.
Genomic diversity and past population histories are key considerations in the fields of conservation and evolutionary biology. In this issue of Molecular Ecology Resources, Prasad et al. (Mol. Ecol. Resour., 2021) examine how the quality and phylogenetic divergence of reference genomes influence the outcomes of downstream analyses such as diversity and demographic history inference. Using the beluga whale and rowi kiwi as examples (Figure 1), they systematically estimate heterozygosity, runs of homozygosity (ROH), and demographic history (PSMC) using reference genomes of varying quality and phylogenetic divergence from the target species. They show that demographic history analyses are impacted by phylogenetic distance, although this is not pronounced until divergence exceeds 3% from the target species. Similarly, their results imply that heterozygosity estimates are dependent on phylogenetic distance and the method used to perform the estimates, and ROHs are potentially undetectable when a nonconspecific reference is used. This investigation into the role of divergence and quality of reference genomes highlights the impact and potential biases generated by genome selection on downstream analyses, and provides a possible alternative in cross-species scaffolding in instances where a conspecific reference genome is not available.
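The two summary statistics discussed in this abstract, heterozygosity and runs of homozygosity (ROH), can be made concrete with a toy sketch. It assumes genotype calls have already been reduced to an ordered list of per-site flags (1 for heterozygous, 0 for homozygous) along one chromosome; the function names and the `min_len` threshold are hypothetical, and real analyses work on physical coordinates and genotype likelihoods rather than site counts.

```python
def heterozygosity(is_het):
    """Fraction of called sites that are heterozygous."""
    return sum(is_het) / len(is_het)


def runs_of_homozygosity(is_het, min_len=3):
    """Return half-open (start, end) index ranges of homozygous runs.

    Only runs spanning at least min_len consecutive sites are reported.
    """
    runs = []
    start = None  # index where the current homozygous run began
    for i, het in enumerate(is_het):
        if not het:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the chromosome.
    if start is not None and len(is_het) - start >= min_len:
        runs.append((start, len(is_het)))
    return runs
```

The sketch also hints at why a divergent reference can hide ROH: spurious heterozygous calls caused by mismapping would split a long run into pieces shorter than the reporting threshold.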

16.
The Wnts

Background

The eukaryotic ubiquitin-conjugation system sets the turnover rate of many proteins and includes activating enzymes (E1s), conjugating enzymes (UBCs/E2s), and ubiquitin-protein ligases (E3s), which are responsible for activation, covalent attachment and substrate recognition, respectively. There are also ubiquitin-like proteins with distinct functions, which require their own E1s and E2s for attachment. We describe the results of RNA interference (RNAi) experiments on the E1s, UBC/E2s and ubiquitin-like proteins in Caenorhabditis elegans. We also present a phylogenetic analysis of UBCs.

Results

The C. elegans genome encodes 20 UBCs and three ubiquitin E2 variant proteins. RNAi shows that only four UBCs are essential for embryogenesis: LET-70 (UBC-2), a functional homolog of yeast Ubc4/5p, UBC-9, an ortholog of yeast Ubc9p, which transfers the ubiquitin-like modifier SUMO, UBC-12, an ortholog of yeast Ubc12p, which transfers the ubiquitin-like modifier Rub1/Nedd8, and UBC-14, an ortholog of Drosophila Courtless. RNAi of ubc-20, an ortholog of yeast UBC1, results in a low frequency of arrested larval development. A phylogenetic analysis of C. elegans, Drosophila and human UBCs shows that this protein family can be divided into 18 groups, 13 of which include members from all three species. The activating enzymes and the ubiquitin-like proteins NED-8 and SUMO are required for embryogenesis.

Conclusions

The number of UBC genes appears to increase with developmental complexity, and our results suggest functional overlap in many of these enzymes. The ubiquitin-like proteins NED-8 and SUMO and their corresponding activating enzymes are required for embryogenesis.

17.
18.
Phylogenomics reveal a robust fungal tree of life   (Total citations: 3; self-citations: 0; citations by others: 3)
Our understanding of the tree of life (TOL) is still fragmentary. Until recently, molecular phylogeneticists have built trees based on ribosomal RNA sequences and selected protein sequences, which, however, usually suffered from lack of support for the deeper branches and inconsistencies probably due to limited subsampling of the entire genome. Now, phylogenetic hypotheses can be based on the analysis of full genomes. We used available complete genome data as well as the eukaryote orthologous group (KOG) proteins to reconstruct with confidence basal branches of the fungal TOL. Phylogenetic analysis of a core of 531 KOGs shared among 21 fungal genomes, three animal genomes and one plant genome showed a single tree with high support resulting from four different methods of phylogenetic reconstruction. The single tree that we inferred from our dataset showed excellent nodal support for each branch, suggesting that it reflects the true phylogenetic relationships of the species involved.

19.
Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element–like and long interspersed element–like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes is higher than that between human and teleost fish genomes. The elephant shark genome contains four putative Hox clusters, indicating that, unlike teleost fish genomes, it has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes.

20.
The assignment of orthologous genes between a pair of genomes is a fundamental and challenging problem in comparative genomics, since many computational methods for solving various biological problems critically rely on bona fide orthologs as input. While it is usually done using sequence similarity search, we recently proposed a new combinatorial approach that combines sequence similarity and genome rearrangement. This paper continues the development of the approach and unites genome rearrangement events and (post-speciation) duplication events in a single framework under the parsimony principle. In this framework, orthologous genes are assumed to correspond to each other in the most parsimonious evolutionary scenario involving both genome rearrangement and (post-speciation) gene duplication. Besides several original algorithmic contributions, the enhanced method allows for the detection of inparalogs. Following this approach, we have implemented a high-throughput system for ortholog assignment on a genome scale, called MSOAR, and applied it to human and mouse genomes. As the results will show, MSOAR is able to find 99 more true orthologs than the INPARANOID program did. In comparison to the iterated exemplar algorithm on simulated data, MSOAR performed favorably in terms of assignment accuracy. We also validated our predicted main ortholog pairs between human and mouse using public ortholog assignment datasets, synteny information, and gene function classification. These test results indicate that our approach is very promising for genome-wide ortholog assignment. Supplemental material and the MSOAR program are available at http://msoar.cs.ucr.edu.

Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号