Similar literature
20 similar articles found.
1.
MOTIVATION: Duplication of genomic sequences is a common phenomenon in tumor cells. While many duplications associated with tumors have been identified (e.g. via techniques such as CGH), both the organization of the duplicated sequences and the process that leads to these duplications are less clear. One mechanism that has been observed to lead to duplication is the extraction of DNA from the chromosomes and aggregation of this DNA into small, independently replicating linear or circular DNA sequences (amplisomes). Parts of these amplisomes may later be reinserted into the main chromosomes, leading to duplication. Although amplisomes are known to play an important role in tumorigenesis, their architecture and even size remain largely unknown. RESULTS: We reconstruct the structure of tumor amplisomes by analyzing duplications in the tumor genome. Our approach relies on recently generated data from End Sequence Profiling (ESP) experiments, which allow us to examine the fine structure of duplications in a tumor on a genome-wide scale. Using ESP data, we formulate the Amplisome Reconstruction Problem, describe an algorithm for its solution, and derive a putative architecture of a tumor amplisome that is the source for duplicated material in the MCF7 breast tumor cell line.

2.
3.

Background  

Overlapping genes (OGs) are defined as adjacent genes whose coding sequences overlap partially or entirely. They are ubiquitous in microbial genomes and more conserved between species than non-overlapping genes. Based on this property, we have previously implemented a web server, named OGtree, that allows the user to reconstruct genome trees of some prokaryotes according to their pairwise OG distances. By analogy to the analyses of gene content and gene order, the OG distance we defined between two genomes was based on a measure combining OG content (i.e., the normalized number of shared orthologous OG pairs) and OG order (i.e., the normalized OG breakpoint distance) in their whole genomes. A shortcoming of using the concept of breakpoints to define the OG distance is its inability to analyze the OG distance of multi-chromosomal genomes. In addition, the amount of overlapping coding sequences between some distantly related prokaryotic genomes may be limited, so that it is hard to find enough OGs to properly evaluate their pairwise OG distances.
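
As a rough illustration of the "normalized breakpoint distance" idea mentioned in this abstract, the following minimal Python sketch counts adjacencies that are present in one gene order but missing from another. It is not OGtree's formula: the function names, the use of unsigned adjacencies, the normalization by the number of adjacencies, and the restriction to a single linear chromosome with unique identifiers are all assumptions made for the example.

```python
def adjacencies(order):
    """Return the unordered neighbouring pairs of a linear gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def normalized_breakpoint_distance(order_a, order_b):
    """Fraction of adjacencies in genome A that are broken in genome B.

    Assumes (hypothetically) that each identifier occurs at most once per
    genome and that both genomes are single linear chromosomes.
    """
    # Compare only the elements the two genomes have in common.
    shared = set(order_a) & set(order_b)
    a = [g for g in order_a if g in shared]
    b = [g for g in order_b if g in shared]
    if len(a) < 2:
        return 0.0  # no adjacencies to compare
    broken = adjacencies(a) - adjacencies(b)
    return len(broken) / (len(a) - 1)


if __name__ == "__main__":
    genome_a = ["og1", "og2", "og3", "og4", "og5"]
    genome_b = ["og1", "og2", "og4", "og3", "og5"]
    print(normalized_breakpoint_distance(genome_a, genome_b))
```

In the usage example, two of the four adjacencies of `genome_a` (og2-og3 and og4-og5) are absent from `genome_b`. Because the sketch treats each genome as one ordered list, it also shows why a breakpoint-based measure does not carry over directly to multi-chromosomal genomes.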

4.
The prospect of understanding the relationship between the genome and the physiology of an organism is an important incentive to reconstruct metabolic networks. The first steps in the process can be automated and it does not take much effort to obtain an initial metabolic reconstruction from a genome sequence. However, such a reconstruction is certainly not flawless and correction of the many imperfections is laborious. It requires the combined analysis of the available information on protein sequence, phylogeny, gene-context and co-occurrence but is also aided by high-throughput experimental data. Simultaneously, the reconstructed network provides the opportunity to visualize the "omics" data within a relevant biological functional context and thus aids the interpretation of those data.

5.
MOTIVATION: Gene duplications and losses (GDLs) are important events in genome evolution. They result in expansion or contraction of gene families, with a likely role in phenotypic evolution. As more genomes become available and their annotations are improved, software programs capable of rapidly and accurately identifying the content of ancestral genomes and the timings of GDLs become necessary to understand the unique evolution of each lineage. RESULTS: We report EvolMAP, a new algorithm and software that utilizes a species tree-based gene clustering method to join all-to-all symmetrical similarity comparisons of multiple gene sets in order to infer the gene composition of multiple ancestral genomes. The algorithm further uses Dollo parsimony-based comparison of the inferred ancestral genes to pinpoint the timings of GDLs onto evolutionary intervals marked by speciation events. Using EvolMAP, first we analyzed the expansion of four families of G-protein coupled receptors (GPCRs) within animal lineages. In addition to demonstrating the unique expansion tree for each family, results also show that the ancestral eumetazoan genome contained many fewer GPCRs than modern animals, and these families expanded through concurrent lineage-specific duplications. Second, we analyzed the history of GDLs in mammalian genomes by comparing seven proteomes. In agreement with previous studies, we report that the mammalian gene family sizes have changed drastically through their evolution. Interestingly, although we identified a potential source of duplication for 75% of the gained genes, the remaining 25% did not have clear-cut sources, revealing thousands of genes that have likely gained their distinct sequence identities within the descent of mammals. AVAILABILITY: Query server, source code and executable are available at http://kosik-web.mcdb.ucsb.edu/evolmap/index.htm .
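
The Dollo parsimony step named in this abstract can be illustrated with a small, generic sketch. Under the Dollo assumption a gene family is gained once and may then be lost repeatedly, so for a single presence/absence character the gain sits at the last common ancestor of all species carrying the gene, and a loss is placed on every edge leading from a carrier into a subtree with no carriers. The code below is a textbook-style illustration and not EvolMAP's implementation; the tree encoding (`children` dictionary) and all names are hypothetical.

```python
def dollo_reconstruct(children, root, present_leaves):
    """Place one gain and the implied losses for a presence/absence character.

    children: dict mapping each internal node to a list of its child nodes.
    present_leaves: set of leaves in which the gene family is observed.
    Returns (gain_node, nodes_with_gene, loss_edges).
    """
    has_present = {}

    def mark(node):
        kids = children.get(node, [])
        if kids:
            # Build the full list first so every subtree is recorded.
            flags = [mark(k) for k in kids]
            has_present[node] = any(flags)
        else:
            has_present[node] = node in present_leaves
        return has_present[node]

    if not mark(root):
        return None, set(), []  # gene absent everywhere: no gain to place

    # Walk down from the root while exactly one child subtree carries the
    # gene; the stopping point is the last common ancestor of all carriers.
    gain = root
    while True:
        carrying = [k for k in children.get(gain, []) if has_present[k]]
        if len(carrying) != 1:
            break
        gain = carrying[0]

    # Below the gain, the gene persists wherever a descendant carries it;
    # an edge into a carrier-free subtree is counted as one loss.
    nodes_with_gene, loss_edges = set(), []
    stack = [gain]
    while stack:
        node = stack.pop()
        nodes_with_gene.add(node)
        for kid in children.get(node, []):
            if has_present[kid]:
                stack.append(kid)
            else:
                loss_edges.append((node, kid))
    return gain, nodes_with_gene, loss_edges


if __name__ == "__main__":
    tree = {"R": ["X", "Y"], "X": ["A", "B"], "Y": ["C", "D"]}
    print(dollo_reconstruct(tree, "R", {"A", "C"}))
```

For the example tree ((A,B),(C,D)) with the gene observed in A and C, the sketch places the gain at the root and one loss on each of the edges leading to B and D. Mapping such events onto dated speciation intervals, as the abstract describes, would require additional information that is not modelled here.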

6.
Here, shotgun metagenomic sequencing was conducted to reveal the hydrogen-oxidizing autotrophic-denitrifying metabolism in an enriched Thauera-dominated consortium. A draft genome named Thauera R4 of over 90% completeness (3.8 Mb) was retrieved mainly by a coverage-defined binning method from 3.5 Gb paired-end Illumina reads. We identified 1,263 genes (accounting for 33% of total genes in the finished genome of Thauera aminoaromatica MZ1T) with average nucleotide identity of 87.6% shared between Thauera R4 and T. aminoaromatica MZ1T. Although Thauera R4 and T. aminoaromatica shared quite similar nitrogen metabolism and a high nucleotide similarity (98.8%) in their 16S ribosomal RNA genes, they showed different functional potentials in several important environmentally relevant processes. Unlike T. aminoaromatica MZ1T, Thauera R4 carries an operon of [NiFe]-hydrogenase (EC 1.12.99.6) catalyzing molecular hydrogen oxidation in nitrate-rich solution. Moreover, Thauera R4 is a mixotrophic bacterium possessing key enzymes for autotrophic CO2-fixation and heterotrophic acetate assimilation metabolism. This Thauera R4 bin provides another genetic reference to better understand the niches of Thauera and demonstrates a model pipeline to reveal functional profiles and reconstruct novel and dominant genomes from a simplified mixed culture in environmental studies.

7.
8.
Whole‐genome‐shotgun (WGS) sequencing of total genomic DNA was used to recover ~1 Mbp of novel mitochondrial (mtDNA) sequence from Pinus sylvestris (L.) and three members of the closely related Pinus mugo species complex. DNA was extracted from megagametophyte tissue from six mother trees from locations across Europe, and 100‐bp paired‐end sequencing was performed on the Illumina HiSeq platform. Candidate mtDNA sequences were identified by their size and coverage characteristics, and by comparison with published plant mitochondrial genomes. Novel variants were identified, and primers targeting these loci were trialled on a set of 28 individuals from across Europe. In total, 31 SNP loci were successfully resequenced, characterizing 15 unique haplotypes. This approach offers a cost‐effective means of developing marker resources for mitochondrial genomes in other plant species where reference sequences are unavailable.

9.
Tumors contain multiple subpopulations of genetically distinct cancer cells. Reconstructing their evolutionary history can improve our understanding of how cancers develop and respond to treatment. Subclonal reconstruction methods cluster mutations into groups that co-occur within the same subpopulations, estimate the frequency of cells belonging to each subpopulation, and infer the ancestral relationships among the subpopulations by constructing a clone tree. However, often multiple clone trees are consistent with the data and current methods do not efficiently capture this uncertainty; nor can these methods scale to clone trees with a large number of subclonal populations. Here, we formalize the notion of a partially-defined clone tree (partial clone tree for short) that defines a subset of the pairwise ancestral relationships in a clone tree, thereby implicitly representing the set of all clone trees that have these defined pairwise relationships. Also, we introduce a special partial clone tree, the Maximally-Constrained Ancestral Reconstruction (MAR), which summarizes all clone trees fitting the input data equally well. Finally, we extend commonly used clone tree validity conditions to apply to partial clone trees and describe SubMARine, a polynomial-time algorithm producing the subMAR, which approximates the MAR and guarantees that its defined relationships are a subset of those present in the MAR. We also extend SubMARine to work with subclonal copy number aberrations and define equivalence constraints for this purpose. Further, we extend SubMARine to permit noise in the estimates of the subclonal frequencies while retaining its validity conditions and guarantees.
In contrast to other clone tree reconstruction methods, SubMARine runs in time and space that scale polynomially in the number of subclones. We show through extensive noise-free simulation, a large lung cancer dataset and a prostate cancer dataset that the subMAR equals the MAR in all cases where only a single clone tree exists and that it is a perfect match to the MAR in most of the other cases. Notably, SubMARine runs in less than 70 seconds on a single thread with less than one Gb of memory on all datasets presented in this paper, including ones with 50 nodes in a clone tree. On the real-world data, SubMARine almost perfectly recovers the previously reported trees and identifies minor errors made in the expert-driven reconstructions of those trees. The freely-available open-source code implementing SubMARine can be downloaded at https://github.com/morrislab/submarine.

10.
11.
12.
13.
We determined the complete nucleotide sequence of the human endogenous retrovirus genome HERV-K10 isolated as the sequence homologous to the Syrian hamster intracisternal A-particle (type A retrovirus) genome. HERV-K10 is 9,179 base pairs long with long terminal repeats of 968 base pairs at both ends; a sequence 290 base pairs long, however, was found to be deleted. It was concluded that a composite genome having the 290-base-pair fragment is the prototype HERV-K provirus, with gag (666 codons), protease (334 codons), pol (937 codons), and env (618 codons) genes. The size of the protease gene product of HERV-K is essentially the same as that of A- and D-type oncoviruses but nearly twice that of other retroviruses. A comparison of the deduced amino acid sequences encoded by the pol region showed HERV-K to be closely related to types A and D retroviruses and even more so to type B retrovirus. It was noted that the env gene product of HERV-K structurally resembles the mouse mammary tumor virus (type B retrovirus) env protein, and the possible expression of the HERV-K env gene in human breast cancer cells is discussed.

14.
Odorant signal transduction and neurogenesis are fundamental properties of the olfactory epithelium. Many preparations have been used to elucidate some of the mechanisms underlying these properties. In this article, we briefly review these research areas and describe some of the techniques used to obtain the data. We focus specifically on the cell-culture paradigm and the data obtained from various immortal cell lines in attempts to reconstruct the olfactory epithelium in vitro.

15.
16.
Restriction enzyme sites on the avian RNA tumor virus genome. Cited 22 times (self-citations: 12; citations by others: 10)
J M Taylor, T W Hsu, M M Lai. Journal of Virology, 1978, 26(2): 479-484

17.
18.
19.
20.
Novel approaches to bio-imaging and automated computational image processing allow the design of truly quantitative studies in developmental biology. Cell behavior, cell fate decisions, cell interactions during tissue morphogenesis, and gene expression dynamics can be analyzed in vivo for entire complex organisms and throughout embryonic development. We review state-of-the-art technology for live imaging, focusing on fluorescence light microscopy techniques for system-level investigations of animal development, and discuss computational approaches to image segmentation, cell tracking, automated data annotation, and biophysical modeling. We argue that the substantial increase in data complexity and size requires sophisticated new strategies for data analysis to exploit the enormous potential of these new resources.
