Similar articles
1.
The rapid accumulation of complete genomic sequences offers the opportunity to carry out an analysis of inter- and intra-individual genome variation within a species on a routine basis. Sequencing whole genomes requires resources that are currently beyond those of a single laboratory and therefore it is not a practical approach for resequencing hundreds of individual genomes. DNA microarrays present an alternative way to study differences between closely related genomes. Advances in microarray-based approaches have enabled the main forms of genomic variation (amplifications, deletions, insertions, rearrangements and base-pair changes) to be detected using techniques that are readily performed in individual laboratories using simple experimental approaches.

2.
3.
4.
Human-disease etiology can be better understood with phase information about diploid sequences. We present a method for estimating haplotypes, using genotype data from unrelated samples or small nuclear families, that leads to improved accuracy and speed compared to several widely used methods. The method, segmented haplotype estimation and imputation tool (SHAPEIT), scales linearly with the number of haplotypes used in each iteration and can be run efficiently on whole chromosomes.

5.
The bacterium Deinococcus radiodurans is one of the organisms most resistant to ionizing radiation and other DNA-damaging agents. Although 30 Deinococcus species have been identified at present, the whole-genome sequences of most species remain unknown, with the exception of D. radiodurans (DRD), D. geothermalis, and D. deserti. In this study, comparative genomic hybridization (CGH) microarray analysis of three Deinococcus species, D. radiopugnans (DRP), D. proteolyticus (DPL), and D. radiophilus (DRPH), was performed using oligonucleotide arrays based on DRD. Approximately 28%, 14%, and 15% of 3,128 open reading frames (ORFs) of DRD were absent in the genomes of DRP, DPL, and DRPH, respectively. In addition, 162 DRD ORFs were absent in all three species. The absence of 17 randomly selected ORFs was confirmed by Southern blot. Functional classification showed that the absent genes spanned a variety of functional categories: some genes involved in amino acid biosynthesis, cell envelope, cellular processes, central intermediary metabolism, and DNA metabolism were not present in any of the three deinococcal species tested. Finally, comparative genomic data showed that 120 genes were Deinococcus-specific, not the 230 reported previously. Specifically, the ddrD, ddrO, and ddrH genes, previously identified as Deinococcus-specific, were not present in DRP, DPL, or DRPH, suggesting that only a portion of ddr genes are shared by all members of the genus Deinococcus.

6.
7.
It has been postulated that existing species have been linked in the past in a way that can be described using an additive tree structure. Any such tree structure reflecting species relationships is associated with a matrix of distances between the species considered, which is called a distance matrix or a tree metric matrix. A circular order of elements of X corresponds to a circular (clockwise) scanning of the subset X of vertices of a tree drawn on a plane. This paper describes an optimal algorithm using circular orders to compare the topology of two trees given by their distance matrices. This algorithm allows us to compute the Robinson and Foulds topological distance between two trees. It employs circular order tree reconstruction to compute an ordered bipartition table of the tree edges for both given distance matrices. These bipartition tables are then compared to determine the Robinson and Foulds topological distance, known to be an important criterion of tree similarity. The described algorithm has optimal time complexity, requiring O(n^2) time when performed on two n × n distance matrices. It can be generalized to obtain another optimal algorithm, which enables the strict consensus tree of k unrooted trees, given their distance matrices, to be constructed in O(kn^2) time.
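The comparison step described in this abstract, counting the bipartitions that two trees do not share, can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes the trees are already available as nested tuples of leaf names rather than distance matrices, it skips circular-order reconstruction entirely, and it makes no attempt at the O(n^2) bound. The names `leaves`, `bipartitions` and `robinson_foulds` are hypothetical helpers introduced only for this illustration.

```python
from typing import FrozenSet, Set, Tuple, Union

# Assumption: a tree is a leaf name (str) or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of leaf names below (and including) this node."""
    if isinstance(tree, str):
        return frozenset([tree])
    result: FrozenSet[str] = frozenset()
    for child in tree:
        result |= leaves(child)
    return result


def bipartitions(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the non-trivial leaf bipartitions induced by internal edges.

    Each split is stored as the side that does NOT contain a fixed anchor
    leaf, so the same split always has the same representation.
    """
    all_leaves = leaves(tree)
    anchor = min(all_leaves)
    splits: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        clade: FrozenSet[str] = frozenset()
        for child in node:
            clade |= visit(child)
        side = all_leaves - clade if anchor in clade else clade
        # Skip trivial splits (single leaf or everything on one side).
        if 2 <= len(side) <= len(all_leaves) - 2:
            splits.add(side)
        return clade

    visit(tree)
    return splits


def robinson_foulds(a: Tree, b: Tree) -> int:
    """Number of bipartitions present in exactly one of the two trees."""
    if leaves(a) != leaves(b):
        raise ValueError("trees must be defined on the same leaf set")
    return len(bipartitions(a) ^ bipartitions(b))


t1 = (("A", "B"), "C", ("D", "E"))
t2 = (("A", "C"), "B", ("D", "E"))
print(robinson_foulds(t1, t2))
```

Here the two example trees share the split separating D and E from the rest but disagree on the other internal edge, so the symmetric difference contains one split from each tree.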

8.
Whole-genome sequences are now available for many microbial species and clades; however, existing whole-genome alignment methods are limited in their ability to compare multiple sequences simultaneously. Here we present the Harvest suite of core-genome alignment and visualization tools for the rapid and simultaneous analysis of thousands of intraspecific microbial strains. Harvest includes Parsnp, a fast core-genome multi-aligner, and Gingr, a dynamic visual platform. Together they provide interactive core-genome alignments, variant calls, recombination detection, and phylogenetic trees. Using simulated and real data we demonstrate that our approach exhibits unrivaled speed while maintaining the accuracy of existing methods. The Harvest suite is open-source and freely available from: http://github.com/marbl/harvest.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0524-x) contains supplementary material, which is available to authorized users.

9.
Background

The three-dimensional organization of the genome is tightly connected to its biological function. The Hi-C approach was recently introduced as a method that can be used to identify higher-order chromatin interactions genome-wide. The aim of this study was to determine genome-wide chromatin interaction frequencies using the Hi-C approach in mouse sperm cells and embryonic fibroblasts.

Results

The obtained data demonstrate that the three-dimensional genome organizations of sperm and fibroblast cells show a high degree of similarity both with each other and with the previously described mouse embryonic stem cells. Both A- and B-compartments and topologically associated domains are present in spermatozoa and fibroblasts. Nevertheless, sperm cells and fibroblasts exhibit statistically significant differences between each other in the contact probabilities of defined loci. Tight packaging of the sperm genome results in an enrichment of long-range contacts compared with the fibroblasts. However, only 30% of the differences in the number of contacts are based on differences in the densities of their genome packages; the main source of the differences is the gain or loss of contacts that are specific for defined genome regions. We find that the dependence of the contact probability on genomic distance for sperm is close to the dependence predicted for the fractal globular folding of chromatin.

Conclusions

Overall, we can conclude that the three-dimensional structure of the genome is passed through generations without being dramatically changed in sperm cells.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-015-0642-0) contains supplementary material, which is available to authorized users.

10.
Comparison of mitochondrial genome structures in bivalves   Cited by: 4 (self-citations: 0; citations by others: 4)
宋文涛  高祥刚  李云峰  刘卫东  刘莹  赫崇波 《遗传》2009,31(11):1127-1134
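Using comparative genomics and bioinformatics methods, we compared the structural features of the mitochondrial genomes of 14 marine and 2 freshwater bivalve species deposited in GenBank. The genome structure and gene order of the bivalve mitochondrial genomes all differed from one another. Genome size, gene arrangement, and gene content also differed markedly among orders, families, and genera; gene arrangement in particular showed no obvious pattern. Phylogenetic analyses of the complete mitochondrial genome sequences and of the coding gene sequences of the 16 bivalves gave different clustering results. When the complete genome sequences were used, the clustering of the 16 species largely agreed with their traditional morphological classification. When all protein-coding genes and the 2 rRNA genes of the 16 species were arranged in a consistent order and then clustered, the resulting classification differed considerably from the traditional morphological classification.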

11.
In this paper, we are interested in the computational complexity of computing (dis)similarity measures between two genomes when they contain duplicated genes or genomic markers, a problem that arises frequently when comparing whole nuclear genomes. Recently, several methods ([1], [2]) have been proposed that are based on two steps to compute a given (dis)similarity measure M between two genomes G_1 and G_2: first, one establishes a one-to-one correspondence between genes of G_1 and genes of G_2; second, once this correspondence is established, it defines explicitly a permutation and it is then possible to quantify their similarity using classical measures defined for permutations, like the number of breakpoints. Hence these methods rely on two elements: a way to establish a one-to-one correspondence between genes of a pair of genomes, and a (dis)similarity measure for permutations. The problem is then, given a (dis)similarity measure for permutations, to compute a correspondence that defines an optimal permutation for this measure. We are interested here in two models to compute a one-to-one correspondence: the exemplar model, where all but one copy are deleted in both genomes for each gene family, and the matching model, which computes a maximal correspondence for each gene family. We show that for these two models, and for three (dis)similarity measures on permutations, namely the number of common intervals, the maximum adjacency disruption (MAD) number and the summed adjacency disruption (SAD) number, the problem of computing an optimal correspondence is NP-complete, and even APX-hard for the MAD number and SAD number.
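The two-step scheme in this abstract (first fix a one-to-one correspondence, then score the resulting permutation) can be sketched in Python for the exemplar model. This is an illustration under stated assumptions, not the paper's construction: genomes are unsigned lists of gene-family labels, the score is the number of breakpoints (the classical measure the abstract mentions as an example, not one of the three measures it analyses), and the search is plain brute force. The function names `breakpoints`, `exemplar_candidates` and `min_exemplar_breakpoints` are hypothetical.

```python
from itertools import product
from typing import Dict, Iterator, List


def breakpoints(g1: List[str], g2: List[str]) -> int:
    """Count adjacent pairs of g1 that are not adjacent (in either order) in g2.

    Assumption: both genomes are duplicate-free and contain the same genes.
    """
    if sorted(g1) != sorted(g2) or len(set(g1)) != len(g1):
        raise ValueError("genomes must be permutations of the same gene set")
    adjacent_in_g2 = {frozenset(pair) for pair in zip(g2, g2[1:])}
    return sum(1 for pair in zip(g1, g1[1:])
               if frozenset(pair) not in adjacent_in_g2)


def exemplar_candidates(genome: List[str]) -> Iterator[List[str]]:
    """Yield every genome obtained by keeping exactly one copy per family."""
    positions: Dict[str, List[int]] = {}
    for index, family in enumerate(genome):
        positions.setdefault(family, []).append(index)
    families = sorted(positions)
    # One kept position per family; the product grows exponentially.
    for choice in product(*(positions[f] for f in families)):
        yield [genome[i] for i in sorted(choice)]


def min_exemplar_breakpoints(g1: List[str], g2: List[str]) -> int:
    """Brute-force minimum number of breakpoints over all exemplar choices."""
    if set(g1) != set(g2):
        raise ValueError("genomes must contain the same gene families")
    return min(breakpoints(a, b)
               for a in exemplar_candidates(g1)
               for b in exemplar_candidates(g2))


print(min_exemplar_breakpoints(["a", "b", "a", "c"], ["a", "b", "c"]))
```

The number of exemplar choices is the product of the family sizes, so this enumeration is only usable on toy inputs, which fits the hardness results reported in the abstract.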

12.
The discovery of novel viruses has often been accomplished by using hybridization-based methods that necessitate the availability of a previously characterized virus genome probe or knowledge of the viral nucleotide sequence to construct consensus or degenerate PCR primers. In their natural replication cycle, certain viruses employ a rolling-circle mechanism to propagate their circular genomes, and multiply primed rolling-circle amplification (RCA) with phi29 DNA polymerase has recently been applied in the amplification of circular plasmid vectors used in cloning. We employed an isothermal RCA protocol that uses random hexamer primers to amplify the complete genomes of papillomaviruses without the need for prior knowledge of their DNA sequences. We optimized this RCA technique with extracted human papillomavirus type 16 (HPV-16) DNA from W12 cells, using a real-time quantitative PCR assay to determine amplification efficiency, and obtained a 2.4 × 10^4-fold increase in HPV-16 DNA concentration. We were able to clone the complete HPV-16 genome from this multiply primed RCA product. The optimized protocol was subsequently applied to a bovine fibropapillomatous wart tissue sample. Whereas no papillomavirus DNA could be detected by restriction enzyme digestion of the original sample, multiply primed RCA enabled us to obtain a sufficient amount of papillomavirus DNA for restriction enzyme analysis, cloning, and subsequent sequencing of a novel variant of bovine papillomavirus type 1. The multiply primed RCA method allows the discovery of previously unknown papillomaviruses, and possibly also other circular DNA viruses, without a priori sequence information.

13.
14.

Background

Multiple models have been proposed to interpret the retention of duplicated genes. In this study, we attempted to compare whether the duplicates arising from tandem duplications and retropositions are retained by the same mechanisms in human and mouse genomes.

Results

Both sequence and expression similarity analyses revealed that tandem duplicates tend to be more conserved, whereas retrogenes tend to be more divergent. The duplicability of tandem duplicates is also higher than that of retrogenes. However, positive selection seems to play significant roles in the retention of both types of duplicates.

Conclusions

We propose that the dosage effect is more prevalent in the retention of tandem duplicates, while the "escape from adaptive conflict" (EAC) effect is more prevalent in the retention of retrogenes.

15.
ACT: the Artemis Comparison Tool   Cited by: 15 (self-citations: 0; citations by others: 15)
The Artemis Comparison Tool (ACT) allows an interactive visualisation of comparisons between complete genome sequences and associated annotations. The comparison data can be generated with several different programs: BLASTN, TBLASTX or Mummer comparisons between genomic DNA sequences, or orthologue tables generated by reciprocal FASTA comparison between protein sets. It is possible to identify regions of similarity, insertions and rearrangements at any level from the whole genome to base-pair differences. ACT uses Artemis components to display the sequences and so inherits powerful searching and analysis tools. ACT is part of the Artemis distribution and is similarly open source, written in Java, and can run on any Java-enabled platform, including UNIX, Macintosh and Windows.

16.
17.
Two complete mitochondrial genomes (mtDNAs) of chaetognaths, Spadella cephaloptera and Paraspadella gotoi, have been recently published. These genomes are highly unusual. They are the smallest metazoan mtDNAs so far known; the atp6 and atp8 genes are missing; lastly, our reanalysis shows that, contrary to what was previously published for one sequence, both contain a single transfer RNA (tRNA(Met)), indicating that both have the same gene content. Indeed, even though the gene order seems very different, two gene blocks are conserved. In addition, comparison of gene arrangement suggests phylogenetic relationships between chaetognaths and some lophotrochozoans such as annelids and molluscs.

18.
Arquès DG  Lacan J  Michel CJ 《Bio Systems》2002,66(1-2):73-92
A new statistical approach using functions based on the circular code correctly classifies more than 93% of bases in protein (coding) genes and non-coding genes of human sequences. Based on this statistical study, research software called 'Analysis of Coding Genes' (ACG) has been developed for identifying protein genes in genomes and for determining their frame. Furthermore, the ACG software also allows an evaluation of the length of protein genes, their position in the genome, their position relative to one another, and the prediction of internal frames in protein genes.

19.
CircRNAs are novel members of the non-coding RNA family. CircRNAs have been known to exist for several decades; however, their widespread abundance has only recently become appreciated. Annotation of circRNAs depends on sequencing reads that span the backsplice junction and therefore map as non-linear reads in the genome. Several pipelines have been developed to specifically identify these non-linear reads and consequently predict the landscape of circRNAs based on deep sequencing datasets. Here, we use common RNAseq datasets to scrutinize and compare the output from five different algorithms (circRNA_finder, find_circ, CIRCexplorer, CIRI, and MapSplice) and evaluate the levels of bona fide and false positive circRNAs based on RNase R resistance. By this approach, we observe surprisingly dramatic differences between the algorithms, specifically regarding the highly expressed circRNAs and the circRNAs derived from proximal splice sites. Collectively, this study emphasizes that circRNA annotation should be handled with care and that several algorithms should ideally be combined to achieve reliable predictions.

20.