20 similar articles found (search time: 31 ms)
1.
Background
The reconstruction of ancestral genomes must deal with the problem of resolution, necessarily involving a trade-off between trying to identify genomic details and being overwhelmed by noise at higher resolutions.
Results
We use the median reconstruction at the synteny block level, of the ancestral genome of the order Gentianales, based on coffee, Rhazya stricta and grape, to exemplify the effects of resolution (granularity) on comparative genomic analyses.
Conclusions
We show how decreased resolution blurs the differences between evolving genomes, with respect to rate, mutational process and other characteristics.
2.
Peter A. Larsen R. Alan Harris Yue Liu Shwetha C. Murali C. Ryan Campbell Adam D. Brown Beth A. Sullivan Jennifer Shelton Susan J. Brown Muthuswamy Raveendran Olga Dudchenko Ido Machol Neva C. Durand Muhammad S. Shamim Erez Lieberman Aiden Donna M. Muzny Richard A. Gibbs Anne D. Yoder Jeffrey Rogers Kim C. Worley 《BMC biology》2017,15(1):110
3.
Background
Miniature inverted-repeat transposable elements (MITEs) are a type of class II non-autonomous transposable element that plays a crucial role in evolution. There is an urgent need for bioinformatics tools that effectively identify MITEs on a genome-wide scale. However, most existing tools handle large eukaryotic genomes poorly.
Methods
In this paper, we propose a novel tool, MiteFinderII, adapted from our previous algorithm MiteFinder, to efficiently detect MITEs in genomic sequences. It has six major steps: (1) build a K-mer index and search for inverted repeats; (2) filter out inverted repeats with low complexity; (3) merge inverted repeats; (4) filter out candidates with low scores; (5) select final MITE sequences; (6) select representative sequences.
Results
To test its performance, MiteFinderII and three other existing algorithms were applied to identify MITEs in the whole genome of Oryza sativa. Results suggest that MiteFinderII outperforms existing popular tools in terms of both specificity and recall. Additionally, it is much faster and more memory-efficient than other tools in the detection.
Conclusion
MiteFinderII is an accurate and effective tool to detect MITEs hidden in eukaryotic genomes. The source code is freely accessible at the website: https://github.com/screamer/miteFinder.
4.
Background
An increasing number of microbial genomes are being sequenced and deposited in public databases. In addition, several closely related strains are also being sequenced in order to understand the genetic basis of diversity and the mechanisms that lead to the acquisition of new genetic traits. These efforts have created a need to visualize microbial genomes and perform genome comparisons on a finer scale. We have developed GenomeViz to enable rapid visualization and subsequent comparisons of several microbial genomes in an interactive environment.
Results
Here we describe a program that allows visualization of both qualitative and quantitative information from complete and partially sequenced microbial genomes. Using GenomeViz, data deriving from studies on genomic islands, gene/protein classifications, GC content, GC skew, whole genome alignments, microarrays and proteomics may be plotted. Several genomes can be visualized interactively at the same time from a comparative genomic perspective, and publication-quality circular genome plots can be created.
Conclusions
GenomeViz should allow researchers to perform visualization and comparative analysis of up to eight different microbial genomes simultaneously.
5.
Yeong C Kim Yong-Chul Jung Jun Chen Ali H Alhasan Parawee Kaewsaard Yanming Zhang Shuo Ma Steve Rosen San Ming Wang 《BMC research notes》2010,3(1):341
Background
Chronic lymphocytic leukemia (CLL) is the most common adult leukemia in the western population. Although genetic factors are considered to contribute to CLL etiology, at present genomic aberrations identified in CLL are limited compared with those identified in other types of leukemia, which raises the question of the degree of genetic influence on CLL. We performed a high-resolution genome scanning study to address this issue.
Findings
Using the restriction paired-end-based Ditag Genome Scanning technique, we analyzed three primary CLL samples at a kilobase resolution, and further validated the results in eight primary CLL samples including the two used for ditag collection. From 51,632 paired-end tags commonly detected in the three CLL samples representing 5% of the HindIII restriction fragments in the genomes, we identified 230 paired-end tags that were present in all three CLL genomes but not in multiple normal human genome reference sequences. Mapping the full-length sequences of the fragments detected by these unmapped tags in seven additional CLL samples confirmed that these are the genomic aberrations caused by small insertions and deletions, and base changes spreading across coding and non-coding regions.
Conclusions
Our study identified hundreds of loci with insertion, deletion, base change, and restriction site polymorphism present in both coding and non-coding regions in CLL genomes, indicating the wide presence of small genomic aberrations in chronic lymphocytic leukemia. Our study supports the use of a whole genome sequencing approach for comprehensively decoding the CLL genome for better understanding of the genetic defects in CLL.
6.
Ramya Raviram Pedro P. Rocha Vincent M. Luo Emily Swanzey Emily R. Miraldi Cédric Feschotte Richard Bonneau Jane A. Skok 《Genome biology》2018,19(1):216
Background
The organization of chromatin in the nucleus plays an essential role in gene regulation. About half of the mammalian genome comprises transposable elements. Given their repetitive nature, reads associated with these elements are generally discarded or randomly distributed among elements of the same type in genome-wide analyses. Thus, it is challenging to identify the activities and properties of individual transposons. As a result, we only have a partial understanding of how transposons contribute to chromatin folding and how they impact gene regulation.
Results
Using PCR and Capture-based chromosome conformation capture (3C) approaches, collectively called 4Tran, we take advantage of the repetitive nature of transposons to capture interactions from multiple copies of endogenous retroviruses (ERVs) in the human and mouse genomes. With 4Tran-PCR, reads are selectively mapped to unique regions in the genome. This enables the identification of transposable element interaction profiles for individual ERV families and integration events specific to particular genomes. With this approach, we demonstrate that transposons engage in long-range intra-chromosomal interactions guided by the separation of chromosomes into A and B compartments as well as topologically associated domains (TADs). In contrast to 4Tran-PCR, Capture-4Tran can uniquely identify both ends of an interaction that involves retroviral repeat sequences, providing a powerful tool for uncovering the individual transposable element insertions that interact with and potentially regulate target genes.
Conclusions
4Tran provides new insight into the manner in which transposons contribute to chromosome architecture and identifies target genes that transposable elements can potentially control.
7.
Background
Chicken anemia virus (CAV) is the causative agent of chicken infectious anemia. Putative intergenotypic CAV recombinants have been reported previously, based on the earlier classification of CAV sequences into three genotypes. However, it is unknown whether intersubtype recombination occurs between the recently reported four CAV genotypes and five subtypes of genome sequences.
Results
Phylogenetic analysis, together with a variety of computational recombination detection algorithms, was used to investigate near-full-length CAV genomes. Statistically significant evidence of intersubtype recombination was detected in the parent-like and two putative CAV recombinant sequences. This event was shown to occur between CAV subgroup A1 and A2 sequences in the phylogenetic trees.
Conclusions
We revealed that intersubtype recombination in CAV genome sequences played a role in generating genetic diversity within the natural population of CAV.
8.
Background
Species of Paris Sect. Marmorata are valuable medicinal plants that synthesize steroidal saponins with effective pharmacological activity. However, the wild resources of these species are threatened by overexploitation before molecular genetic studies have uncovered their genomes and evolutionary significance. Thus, the availability of complete chloroplast genome sequences of Sect. Marmorata is crucial to understanding the plastome evolution of this section and to facilitating future population genetics studies. Here, we determined the chloroplast genomes of Sect. Marmorata and conducted a whole chloroplast genome comparison.
Results
This study presents detailed sequences and structural variations of the chloroplast genomes of Sect. Marmorata. Over 40 large repeats and approximately 130 simple sequence repeats, as well as a group of genomic hotspots, were detected. Inverted repeat contraction in this section was inferred by comparing its chloroplast genomes with that of P. verticillata. Additionally, almost all the plastid protein-coding genes were found to prefer ending with A/U. Mutation bias and selection pressure predominantly shaped the codon bias of most genes. Most of the genes underwent purifying selection, whereas photosynthetic genes experienced relatively relaxed purifying selection.
Conclusions
Repeat sequences and hotspot regions can be scanned to detect intraspecific and interspecific variability, and selected to infer the phylogenetic relationships of Sect. Marmorata and other species in subgenus Daiswa. Mutation and natural selection were the main forces driving the codon bias pattern of most plastid protein-coding genes. This study therefore enhances the understanding of the evolution of Sect. Marmorata from the chloroplast genome, and provides genomic insights into genetic analyses of Sect. Marmorata.
9.
Background
Bacterial genomes develop new mechanisms to tide them over the imposing conditions they encounter during the course of their evolution. Acquisition of new genes by lateral gene transfer may be one of the dominant ways of adaptation in bacterial genome evolution. Lateral gene transfer provides the bacterial genome with a new set of genes that help it to explore and adapt to new ecological niches.
Methods
A maximum likelihood analysis was done on the five sequenced corynebacterial genomes to model the rates of gene insertions/deletions at various depths of the phylogeny.
Results
The study shows that most of the laterally acquired genes are transient and the inferred rates of gene movement are higher on the external branches of the phylogeny and decrease as the phylogenetic depth increases. The newly acquired genes are under relaxed selection and evolve faster than their older counterparts. Analysis of some of the functionally characterised LGTs in each species has indicated that they may have a possible adaptive role.
Conclusion
The five corynebacterial genomes sequenced to date have evolved by acquiring between 8–14% of their genomes by LGT and some of these genes may have a role in adaptation.
10.
Jensen LJ Skovgaard M Sicheritz-Pontén T Jørgensen MK Lundegaard C Pedersen CC Petersen N Ussery D 《BMC genomics》2003,4(1):12
Background
For most sequenced prokaryotic genomes, about a third of the protein coding genes annotated are "orphan proteins", that is, they lack homology to known proteins. These hypothetical genes are typically short and randomly scattered throughout the genome. This trend is seen for most of the bacterial and archaeal genomes published to date.
Results
In contrast, we have found that a large fraction of the genes coding for such orphan proteins in the Methanopyrus kandleri AV19 genome occur within two large regions. These genes have no known homologs except from other M. kandleri genes. However, analysis of their lengths, codon usage, and Ribosomal Binding Site (RBS) sequences shows that they are most likely true protein coding genes and not random open reading frames.
Conclusions
Although these regions can be considered as candidates for massive lateral gene transfer, our bioinformatics analysis suggests that this is not the case. We predict many of the organism specific proteins to be transmembrane and belong to protein families that are non-randomly distributed between the regions. Consistent with this, we suggest that the two regions are most likely unrelated, and that they may be integrated plasmids.
11.
12.
Background
With the publication of the draft chicken genome and the recent production of several BAC clone libraries from non-avian reptiles and birds, it is now possible to undertake more detailed comparative genomic studies in Reptilia. Of interest in particular are the genomic events that transformed the large, repeat-rich genomes of mammals and non-avian reptiles into the minimalist chicken genome. We have used paired BAC end sequences (BESs) from the American alligator (Alligator mississippiensis), painted turtle (Chrysemys picta) and emu (Dromaius novaehollandiae) to investigate patterns of sequence divergence, gene and retroelement content, and microsynteny between these species and chicken.
Results
From a total of 11,967 curated BESs, we successfully mapped 725, 773 and 2597 sequences in alligator, turtle, and emu, respectively, to sites in the draft chicken genome using a stringent BLAST protocol. Most commonly, sequences mapped to a single site in the chicken genome. Of 1675, 1828 and 2936 paired BESs obtained for alligator, turtle, and emu, respectively, a total of 34 (alligator, 2%), 24 (turtle, 1.3%) and 479 (emu, 16.3%) pairs were found to map with high confidence and in the correct orientation and with BAC-sized intermarker distances to single chicken chromosomes, including 25 such paired hits in emu mapping to the chicken Z chromosome. By determining the insert sizes of a subset of BAC clones from these three species, we also found a significant correlation between the intermarker distance in alligator and turtle and in chicken, with slopes as expected on the basis of the ratio of the genome sizes.
Conclusion
Our results suggest that a large number of small-scale chromosomal rearrangements and deletions in the lineage leading to chicken have drastically reduced the number of detected syntenies observed between the chicken and alligator, turtle, and emu genomes and imply that small deletions occurring widely throughout the genomes of reptilian and avian ancestors led to the ~50% reduction in genome size observed in birds compared to reptiles. We have also mapped and identified likely gene regions in hundreds of new BAC clones from these species.
13.
Adriana Muñoz Chunfang Zheng Qian Zhu Victor A Albert Steve Rounsley David Sankoff 《BMC bioinformatics》2010,11(1):304
Background
There has been a trend in increasing the phylogenetic scope of genome sequencing without finishing the sequence of the genome. Increasing numbers of genomes are being published in scaffold or contig form. Rearrangement algorithms, however, including gene order-based phylogenetic tools, require whole genome data on gene order or syntenic block order. How then can we use rearrangement algorithms to compare genomes available in scaffold form only? Can the comparative evidence predict the location of unsequenced genes?
Results
Our method involves optimally filling in genes missing from the scaffolds, while incorporating the augmented scaffolds directly into the rearrangement algorithms as if they were chromosomes. This is accomplished by an exact, polynomial-time algorithm. We then correct for the number of extra fusion/fission operations required to make scaffolds comparable to full assemblies. We model the relationship between the ratio of missing genes actually absent from the genome versus merely unsequenced ones, on one hand, and the increase of genomic distance after scaffold filling, on the other. We estimate the parameters of this model through simulations and by comparing the angiosperm genomes Ricinus communis and Vitis vinifera.
Conclusions
The algorithm solves the comparison of genomes with 18,300 genes, including 4500 missing from one genome, in less than a minute on a MacBook, putting virtually all genomes within range of the method.
14.
15.
Evgenii A. Konorov Mikhail A. Nikitin Kirill V Mikhailov Sergey N. Lysenkov Mikhail Belenky Peter L. Chang Sergey V. Nuzhdin Victoria A. Scobeyeva 《BMC evolutionary biology》2017,17(1):39
Background
The world is rapidly urbanizing, and only a subset of species are able to succeed in stressful city environments. Efficient genome-enabled stress response appears to be a likely prerequisite for urban adaptation. Despite the important role ants play in the ecosystem, only the genomes of ~13 have been sequenced so far. Here, we present the draft genome assembly of the black garden ant Lasius niger – the most successful urban inhabitant of all ants – and we compare it with the genomes of other ant species, including the closely related Camponotus floridanus.
Results
Sequences from 272 M Illumina reads were assembled into 41,406 contigs with total length of 245 MB, and N50 of 16,382 bp, similar to other ant genome assemblies enabling comparative genomic analysis. Remarkably, the predicted proteome of L. niger is significantly enriched relative to other ant genomes in terms of abundance of domains involved in nucleic acid binding, DNA repair, and nucleotidyl transferase activity, reflecting transposable element proliferation and a likely genomic response. With respect to environmental stress, we note a proliferation of various detoxification genes, including glutathione-S-transferases and those in the cytochrome P450 families. Notably, the CYP9 family is highly expanded with 19 complete and 21 nearly complete members, over twice as many as in other ants. This family exhibits the signatures of strong directional selection, with eleven positively selected positions in ligand-binding pockets of enzymes. Gene family contraction was detected for several components of the olfactory system, accompanied by instances of both directional selection and relaxation.
Conclusions
Our results suggest that the success of L. niger in urbanized areas may be the result of fortuitous coincidence of several factors, including the expansion of the CYP9 cytochrome family due to coevolution with parasitic fungi, the diversification of DNA repair systems as an answer to proliferation of retroelements, and the reduction of olfactory system and behavioral preadaptations from non-territorial subdominant life strategies found in natural environments. Diversification of cytochromes and DNA repair systems along with reduced odorant communication are the basis of L. niger pollutant resistance and polyphagy, while non-territorial and mobilization strategies allow more efficient exploitation of large but patchy food sources.
16.
Fabian Gärtner Christian Höner zu Siederdissen Lydia Müller Peter F. Stadler 《Algorithms for molecular biology : AMB》2018,13(1):15
Background
Genome sequences and genome annotation data have become available at ever increasing rates in response to the rapid progress in sequencing technologies. As a consequence the demand for methods supporting comparative, evolutionary analysis is also growing. In particular, efficient tools to visualize -omics data simultaneously for multiple species are sorely lacking. A first and crucial step in this direction is the construction of a common coordinate system. Since genomes not only differ by rearrangements but also by large insertions, deletions, and duplications, the use of a single reference genome is insufficient, in particular when the number of species becomes large.
Results
The computational problem then becomes to determine an order and orientations of optimal local alignments that are as co-linear as possible with all the genome sequences. We first review the most prominent approaches to model the problem formally and then proceed to show that it can be phrased as a particular variant of the Betweenness Problem. It is NP-hard in general. As exact solutions are beyond reach for the problem sizes of practical interest, we introduce a collection of heuristic simplifiers to resolve ordering conflicts.
Conclusion
Benchmarks on real-life data ranging from bacterial to fly genomes demonstrate the feasibility of computing good common coordinate systems.
17.
18.
Background
The pufferfish Fugu rubripes (Fugu) with its compact genome is increasingly recognized as an important vertebrate model for comparative genomic studies. In particular, large regions of conserved synteny between human and Fugu genomes indicate its utility to identify disease-causing genes. The human chromosome 12p12 is frequently deleted in various hematological malignancies and solid tumors, but the actual tumor suppressor gene remains unidentified.
Results
We investigated approximately 200 kb of the genomic region surrounding the ETV6 locus in Fugu (fETV6) in order to find conserved functional features, such as genes or regulatory regions, that could give insight into the nature of the genes targeted by deletions in human cancer cells. Seven genes were identified near the fETV6 locus. We found that the synteny with human chromosome 12 was conserved, but extensive genomic rearrangements occurred between the Fugu and human ETV6 loci.
Conclusion
This comparative analysis led to the identification of previously uncharacterized genes in the human genome and some potentially important regulatory sequences as well. This is a good indication that the analysis of the compact Fugu genome will be valuable to identify functional features that have been conserved throughout the evolution of vertebrates.
19.
Background
Precise identification of three-dimensional genome organization, especially enhancer-promoter interactions (EPIs), is important to deciphering gene regulation, cell differentiation and disease mechanisms. Currently, it is a challenging task to distinguish true interactions from other nearby non-interacting ones since the power of traditional experimental methods is limited due to low resolution or low throughput.
Results
We propose a novel computational framework, EP2vec, to assay three-dimensional genomic interactions. We first extract sequence embedding features, defined as fixed-length vector representations learned from variable-length sequences using an unsupervised deep learning method from natural language processing. Then, we train a classifier to predict EPIs using the learned representations in a supervised way. Experimental results demonstrate that EP2vec obtains F1 scores ranging from 0.841 to 0.933 on different datasets, which outperforms existing methods. We demonstrate the robustness of sequence embedding features by carrying out sensitivity analysis. In addition, we identify motifs that represent cell line-specific information through analysis of the learned sequence embedding features by adopting an attention mechanism. Last, we show that even better performance, with F1 scores of 0.889 to 0.940, can be achieved by combining sequence embedding features and experimental features.
Conclusions
EP2vec sheds light on feature extraction for DNA sequences of arbitrary lengths and provides a powerful approach for EPI identification.
20.
Andres Benavides Juan Pablo Isaza Juan Pablo Niño-García Juan Fernando Alzate Felipe Cabarcas 《BMC genomics》2018,19(8):858