Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
In recent years, much effort has been devoted to understanding the three-dimensional (3D) organization of the genome and how genomic structure mediates nuclear function. The development of experimental techniques that combine DNA proximity ligation with high-throughput sequencing, such as Hi-C, has substantially improved our knowledge about chromatin organization. Numerous experimental advancements, utilizing not only DNA proximity ligation but also high-resolution genome imaging (DNA tracing), have required theoretical modeling to determine the structural ensembles consistent with such data. These 3D polymer models of the genome provide an understanding of the physical mechanisms governing genome architecture. Here, we present an overview of the recent advances in modeling the ensemble of 3D chromosomal structures by employing the maximum entropy approach combined with polymer physics. In particular, we discuss the minimal chromatin model (MiChroM) along with the “maximum entropy genomic annotations from biomarkers associated with structural ensembles” (MEGABASE) model, which have been remarkably successful in the accurate modeling of chromosomes consistent with both Hi-C and DNA-tracing data.

2.
3.
Knowledge about the 3D organization of the genome will offer great insights into how cells retrieve and process genetic information. Knowing the spatial probability distributions of individual genes will provide insights into gene regulatory and replication processes, and fill in the missing links between epigenomics, functional genomics, and structural biology. We will discuss an approach to determine 3D genome structures and structure–function maps of genomes by integrating diverse types of data. To address the challenge of modeling highly variable genome structures, we discuss a population-based modeling approach, where we construct a large population of 3D genome structures that together are entirely consistent with all available experimental data, including data from genome-wide chromosome conformation capture and imaging experiments. We interpret the result in terms of probabilities of a sample drawn from a population of heterogeneous structures. We will discuss results on the 3D spatial organization of genomes in human lymphoblastoid cells and budding yeast.

4.
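In recent years, with the development of techniques such as high-throughput chromosome conformation capture (Hi-C) and the falling cost of high-throughput sequencing, the volume of genome-wide interaction data has grown rapidly and the resolution of interaction maps has steadily improved. This has driven considerable progress in modeling the 3D structure of chromosomes and genomes, and several methods have been proposed for building the structure of a single chromosome or a whole genome from chromosome conformation capture data. By analyzing the literature on Hi-C-based reconstruction of 3D chromosome structure, this paper summarizes the principle of 3DMax, a classic algorithm for reconstructing the 3D spatial structure of chromosomes, and proposes a new stochastic gradient ascent algorithm, XNadam, a variant of the Nadam optimization method. XNadam is applied within 3DMax to improve its performance in predicting 3D chromosome structure.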
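The abstract above does not spell out XNadam, so the following is only a minimal sketch of the standard Nadam update (Adam with a Nesterov-style look-ahead on the first moment), written as an ascent step. The function name `nadam_ascent`, the one-variable toy objective, and all hyperparameter values are assumptions made for illustration; none of them come from the paper or from 3DMax.

```python
import math


def nadam_ascent(grad_fn, x0, lr=0.05, beta1=0.9, beta2=0.999,
                 eps=1e-8, steps=2000):
    """Maximize a one-variable function with a standard Nadam-style update.

    This is plain Nadam used for ascent, NOT the XNadam variant from the
    abstract; all names and default values here are illustrative.
    """
    x, m, v = float(x0), 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        # Exponential moving averages of the gradient and squared gradient.
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias-corrected moment estimates, as in Adam.
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        # Nesterov look-ahead: blend corrected momentum with the current gradient.
        lookahead = beta1 * m_hat + (1.0 - beta1) * g / (1.0 - beta1 ** t)
        # Ascent step: move along the gradient to increase the objective.
        x += lr * lookahead / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = -(x - 3)^2, whose gradient is -2 (x - 3).
best_x = nadam_ascent(lambda x: -2.0 * (x - 3.0), x0=0.0)
```

In a structure-reconstruction setting the scalar `x` would be replaced by the flattened 3D coordinates and `grad_fn` by the gradient of the likelihood being maximized.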

5.
6.
Eukaryotes appear to evolve by micro- and macro-rearrangements. This is observed not only for long-term evolutionary adaptation, but also in short-term experimental evolution of yeast, Saccharomyces cerevisiae. Moreover, based on these and other experiments it has been postulated that repeat elements, retroposons for example, mediate such events. We study an evolutionary model in which genomes with retroposons and a breaking/repair mechanism are subjected to a changing environment. We show that retroposon-mediated rearrangements can be a beneficial mutational operator for short-term adaptations to a new environment. However, merely being able to rearrange chromosomes does not imply an advantage over genomes in which only single-gene insertions and deletions occur. Instead, a structuring of the genome is needed: genes that need to be amplified (or deleted) in a new environment have to cluster. We show that genomes hosting retroposons, starting with a random order of genes, will in the long run become organized, which enables (fast) rearrangement-based adaptations to the environment. In other words, our model provides a "proof of principle" that genomes can structure themselves in order to increase the beneficial effect of chromosome rearrangements.

7.
We have been developing FAMSBASE, a protein homology-modeling database of whole ORFs predicted from genome sequences. The latest update of FAMSBASE, which is based on the protein three-dimensional (3D) structures released by November 2003, contains modeled 3D structures for 368,724 open reading frames (ORFs) derived from genomes of 276 species, namely 17 archaebacterial, 130 eubacterial, 18 eukaryotic and 111 phage genomes. Those 276 genomes are predicted to have 734,193 ORFs in total, and the current FAMSBASE contains protein 3D structures for approximately 50% of the ORF products. However, cases in which a modeled 3D structure covers an entire ORF product are rare. When the portion of each ORF covered by a 3D structure is compared across the three kingdoms of life, approximately 60% of the ORFs in archaebacteria and eubacteria have modeled 3D structures covering almost the entire amino acid sequence, whereas the percentage falls to about 30% in eukaryotes. When annual differences in the number of ORFs with modeled 3D structures are calculated, the fraction of modeled 3D structures of soluble proteins has increased by 5% for archaebacteria and by 7% for eubacteria in the last 3 years. Assuming that this rate is maintained and that determination of 3D structures for predicted disordered regions is unattainable, whole soluble protein model structures of prokaryotes without the putative disordered regions will be in hand within 15 years. For eukaryotic proteins, they will be in hand within 25 years. The 3D structures we will have at those times are not the 3D structures of the entire proteins encoded in single ORFs, but the 3D structures of separate structural domains. Measuring or predicting spatial arrangements of structural domains in an ORF will then be a coming issue of structural genomics.

8.
9.
While recently developed short-read sequencing technologies may dramatically reduce the sequencing cost and eventually achieve the $1000 goal for re-sequencing, their limitations prevent the de novo sequencing of eukaryotic genomes with the standard shotgun sequencing protocol. We present SHRAP (SHort Read Assembly Protocol), a sequencing protocol and assembly methodology that utilizes high-throughput short-read technologies. We describe a variation on hierarchical sequencing with two crucial differences: (1) we select a clone library from the genome randomly rather than as a tiling path and (2) we sample clones from the genome at high coverage and reads from the clones at low coverage. We assume that 200 bp read lengths with a 1% error rate and inexpensive random fragment cloning on whole mammalian genomes are feasible. Our assembly methodology is based on first ordering the clones and subsequently performing read assembly in three stages: (1) local assemblies of regions significantly smaller than a clone size, (2) clone-sized assemblies of the results of stage 1, and (3) chromosome-sized assemblies. By aggressively localizing the assembly problem during the first stage, our method succeeds in assembling short, unpaired reads sampled from repetitive genomes. We tested our assembler using simulated reads from D. melanogaster and human chromosomes 1, 11, and 21, and produced assemblies with large sets of contiguous sequence and a misassembly rate comparable to other draft assemblies. Tested on D. melanogaster and the entire human genome, our clone-ordering method produces accurate maps, thereby localizing fragment assembly and enabling the parallelization of the subsequent steps of our pipeline. Thus, we have demonstrated that truly inexpensive de novo sequencing of mammalian genomes will soon be possible with high-throughput, short-read technologies using our methodology.

10.

Background

Several genomes have now been sequenced, with millions of genetic variants annotated. While significant progress has been made in mapping single nucleotide polymorphisms (SNPs) and small (<10 bp) insertion/deletions (indels), the annotation of larger structural variants has been less comprehensive. It is still unclear to what extent a typical genome differs from the reference assembly, and analyses of the genomes sequenced to date have shown varying results for copy number variation (CNV) and inversions.

Results

We have combined computational re-analysis of existing whole genome sequence data with novel microarray-based analysis, and detect 12,178 structural variants covering 40.6 Mb that were not reported in the initial sequencing of the first published personal genome. We estimate a total non-SNP variation content of 48.8 Mb in a single genome. Our results indicate that this genome differs from the consensus reference sequence by approximately 1.2% when considering indels/CNVs, 0.1% by SNPs and approximately 0.3% by inversions. The structural variants impact 4,867 genes, and >24% of structural variants would not be imputed by SNP-association.

Conclusions

Our results indicate that a large number of structural variants have been unreported in the individual genomes published to date. This significant extent and complexity of structural variants, as well as the growing recognition of their medical relevance, necessitate that they be actively studied in health-related analyses of personal genomes. The new catalogue of structural variants generated for this genome provides a crucial resource for future comparison studies.

11.
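Structural genomics research and nuclear magnetic resonance   Cited by: 4 (self-citations: 0; citations by others: 4)
The completion of genome DNA sequencing projects for a variety of organisms has brought structural biology into the era of structural genomics. Structural genomics is the systematic determination of the structures of all genome products. It uses high-throughput selection, expression, purification, structure determination, and computational analysis to provide an experimentally determined structure or a good theoretical model for every protein product of a genome, which will accelerate research across the life sciences. Advances in bioinformatics, genetic engineering, and structure determination techniques have made structural genomics research feasible. Recent methodological advances in nuclear magnetic resonance (NMR) have made it a key method for high-throughput structural analysis in structural genomics.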

12.
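Liu Peifeng (刘沛峰), Wu Qiang (吴强). 《遗传》, 2020, (1): 18-31
The CRISPR/Cas9 system offers major advantages for gene editing: it can be used at low cost, programmably, and conveniently for targeted genome editing and functional modification in animals, plants, and microorganisms. 3D genomics is an interdisciplinary field that has emerged in recent years to study the dynamic regulation of higher-order chromatin structure and the biological functions of the genome. In 3D genome research, DNA fragments are typically edited to mimic genomic structural variation and to label specific DNA fragments, so as to study how regulatory elements affect gene regulation, cell differentiation, tissue development, organ formation, and individual development, and ultimately to elucidate the mechanisms that regulate 3D genome assembly and its biological functions. CRISPR and its derivative technologies therefore provide excellent genetic tools for studying the 3D genome. This article reviews the application of CRISPR DNA-fragment editing and its derivative technologies to the study of 3D genome regulation and function, with the aim of providing a theoretical reference and new ideas for future research.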

13.
The large amount of image data necessary for high-resolution 3D reconstruction of macromolecular assemblies leads to significant increases in computational time. One of the most time-consuming operations is 3D density map reconstruction, and software optimization can greatly reduce the time required for any given structural study. The majority of algorithms proposed for improving the computational efficiency of 3D reconstruction are based on a ray-by-ray projection of each image into the reconstructed volume. In this paper, we propose a novel fast implementation of the "filtered back-projection" algorithm based on a voxel-by-voxel principle. Our implementation has been exhaustively tested using both model and real data. We compared 3D reconstructions obtained by the new approach with results obtained by the filtered back-projection algorithm and the Fourier-Bessel algorithm commonly used for reconstructing icosahedral viruses. These computational experiments demonstrate the robustness, reliability, and efficiency of this approach.
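To make the voxel-by-voxel idea concrete, here is a minimal 2D sketch, assuming parallel-beam projections and NumPy: each projection is ramp-filtered, and every pixel then looks up (by linear interpolation) the filtered value at the detector position it projects onto. This is a textbook pixel-driven filtered back-projection, not the authors' 3D implementation; the function names, the detector geometry, and the point-source test data are all illustrative assumptions.

```python
import numpy as np


def ramp_filter(projection):
    """Apply a ramp (|frequency|) filter to one 1D projection via the FFT."""
    freqs = np.fft.fftfreq(projection.size)
    return np.real(np.fft.ifft(np.fft.fft(projection) * np.abs(freqs)))


def backproject_pixelwise(sinogram, angles_rad, size):
    """Pixel-driven filtered back-projection (2D analogue of voxel-by-voxel).

    sinogram: array of shape (n_angles, n_detectors), parallel-beam geometry.
    The geometry conventions below are assumptions made for this sketch.
    """
    n_det = sinogram.shape[1]
    det_center = (n_det - 1) / 2.0
    img_center = (size - 1) / 2.0
    # Pixel coordinates measured from the image center.
    ys, xs = np.mgrid[0:size, 0:size]
    xs = xs - img_center
    ys = ys - img_center
    det_index = np.arange(n_det)
    image = np.zeros((size, size))
    for projection, theta in zip(sinogram, angles_rad):
        filtered = ramp_filter(projection)
        # Detector coordinate that each pixel projects onto at this angle.
        t = xs * np.cos(theta) + ys * np.sin(theta) + det_center
        # Every pixel gathers its own value by linear interpolation.
        values = np.interp(t.ravel(), det_index, filtered, left=0.0, right=0.0)
        image += values.reshape(size, size)
    return image * np.pi / len(angles_rad)


# Illustrative data: a point source at the image center projects onto the
# central detector bin at every angle.
size = 33
angles = np.linspace(0.0, np.pi, 60, endpoint=False)
sinogram = np.zeros((angles.size, size))
sinogram[:, size // 2] = 1.0
reconstruction = backproject_pixelwise(sinogram, angles, size)
```

Because each output pixel is computed independently from read-only filtered projections, this gather-style loop has no write conflicts between pixels, which is one commonly cited reason voxel-driven schemes are easy to parallelize.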

14.
Restraint-based modeling of genomes has been recently explored with the advent of Chromosome Conformation Capture (3C-based) experiments. We previously developed a reconstruction method to resolve the 3D architecture of both prokaryotic and eukaryotic genomes using 3C-based data. These models were congruent with fluorescent imaging validation. However, the limits of such methods have not been systematically assessed. Here we propose the first evaluation of a mean-field restraint-based reconstruction of genomes by considering diverse chromosome architectures and different levels of data noise and structural variability. The results show that: first, current scoring functions for 3D reconstruction correlate with the accuracy of the models; second, reconstructed models are robust to noise but sensitive to structural variability; third, the local structure organization of genomes, such as Topologically Associating Domains, results in more accurate models; fourth, to a certain extent, the models capture the intrinsic structural variability in the input matrices; and fifth, the accuracy of the models can be a priori predicted by analyzing the properties of the interaction matrices. In summary, our work provides a systematic analysis of the limitations of a mean-field restraint-based method, which could be taken into consideration in further development of methods as well as their applications.

15.
Hexaploid triticale (×Triticosecale Wittmack) lines were examined using molecular markers and the in situ hybridization technique. Triticale lines were generated based on wheat varieties differing in their Vrn gene systems and earing times. Molecular analysis was performed using Xgwm and Xrms microsatellite markers with known chromosomal localization in the common wheat (Triticum aestivum) and rye (Secale cereale) genomes. Comparative molecular analysis of triticale lines and their parental forms showed that all lines contained the A and B genomes of common wheat and also rye homoeologous chromosomes. In three lines, the presence of D genome markers, mapped to chromosomes 2D and 7D, was demonstrated. This was probably the consequence of translocations of homoeologous chromosomes from wheat genomes, which took place during the process of triticale formation. The data obtained by genomic in situ hybridization supported the data of molecular genetic analysis. In none of the lines were wheat-rye translocations or recombinations observed. These findings suggest that the change in the period between seedling appearance and earing time in triticale lines, compared to the initial wheat lines, resulted from the inhibitory effect of the rye genome on wheat vernalization genes.

16.
Chromosomes are not positioned randomly within a nucleus, but instead, they adopt preferred spatial conformations to facilitate necessary long-range gene–gene interactions and regulations. Thus, obtaining the 3D shape of the chromosomes of a genome is critical for understanding how the genome folds and functions, and how its genes interact and are regulated. Here, we describe a method to reconstruct preferred 3D structures of individual chromosomes of the human genome from chromosomal contact data generated by the Hi-C chromosome conformation capturing technique. A novel parameterized objective function was designed for modeling chromosome structures, which was optimized by a gradient descent method to generate chromosomal structural models that could satisfy as many intra-chromosomal contacts as possible. We applied the objective function and the corresponding optimization method to two Hi-C chromosomal data sets, from a healthy and a cancerous human B-cell, to construct 3D models of individual chromosomes at resolutions of 1 MB and 200 KB, respectively. The parameters used with the method were calibrated according to independent fluorescence in situ hybridization experimental data. The structural models generated by our method could satisfy a high percentage of contacts (pairs of loci in interaction) and non-contacts (pairs of loci not in interaction) and were compatible with the known two-compartment organization of human chromatin structures. Furthermore, structural models generated at different resolutions and from randomly permuted data sets were consistent.
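The general idea of optimizing 3D coordinates so that contacts and non-contacts are satisfied can be sketched as follows, assuming NumPy. The objective here is a simple squared-hinge penalty invented for illustration (contact pairs are penalized when farther apart than a threshold `d_c`, non-contact pairs when closer), and the step-halving rule is likewise an assumption; this is not the parameterized objective function described in the abstract.

```python
import numpy as np


def contact_loss_and_grad(coords, contacts, d_c):
    """Squared-hinge penalty on violated contacts and non-contacts.

    coords: (n, 3) positions; contacts: symmetric (n, n) boolean matrix.
    The penalty form is a hypothetical stand-in for the paper's objective.
    """
    n = len(coords)
    diff = coords[:, None, :] - coords[None, :, :]   # diff[i, j] = x_i - x_j
    dist = np.linalg.norm(diff, axis=-1)
    off_diag = ~np.eye(n, dtype=bool)
    too_far = contacts & off_diag & (dist > d_c)       # contact not satisfied
    too_close = ~contacts & off_diag & (dist < d_c)    # non-contact not satisfied
    violation = np.where(too_far | too_close, dist - d_c, 0.0)
    # Each unordered pair appears twice in the symmetric matrix, hence the 0.5.
    loss = 0.5 * np.sum(violation ** 2)
    safe_dist = np.where(dist > 0, dist, 1.0)          # avoid division by zero
    grad = 2.0 * np.sum((violation / safe_dist)[:, :, None] * diff, axis=1)
    return loss, grad


def fit_structure(contacts, d_c=1.0, steps=200, lr=0.05, seed=0):
    """Gradient descent that halves the step size whenever a step fails to help."""
    rng = np.random.default_rng(seed)
    coords = rng.normal(scale=3.0, size=(len(contacts), 3))
    loss, grad = contact_loss_and_grad(coords, contacts, d_c)
    history = [loss]
    for _ in range(steps):
        trial = coords - lr * grad
        trial_loss, trial_grad = contact_loss_and_grad(trial, contacts, d_c)
        if trial_loss < loss:
            coords, loss, grad = trial, trial_loss, trial_grad
        else:
            lr *= 0.5   # reject the step and retry with a smaller one
        history.append(loss)
    return coords, history


# Toy contact map: 8 loci along a chain, where only neighbours are in contact.
n_loci = 8
idx = np.arange(n_loci)
toy_contacts = np.abs(idx[:, None] - idx[None, :]) == 1
structure, loss_history = fit_structure(toy_contacts)
```

Accepting a step only when it lowers the loss keeps the recorded loss non-increasing by construction, which makes the sketch easy to check even though it is slower than a tuned optimizer.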

17.
Differences between individual human genomes, or between human and cancer genomes, range in scale from single nucleotide variants (SNVs) through intermediate and large-scale duplications, deletions, and rearrangements of genomic segments. The latter class, called structural variants (SVs), has received considerable attention in the past several years as a previously underappreciated source of variation in human genomes. Much of this recent attention is the result of the availability of higher-resolution technologies for measuring these variants, including both microarray-based techniques and, more recently, high-throughput DNA sequencing. We describe the genomic technologies and computational techniques currently used to measure SVs, focusing on applications in human and cancer genomics.

What to Learn in This Chapter

  • Current knowledge about the prevalence of structural variation in human and cancer genomes.
  • Strategies for using microarray and high-throughput DNA sequencing technologies to measure structural variation.
  • Computational techniques to detect structural variants from DNA sequencing data.
This article is part of the “Translational Bioinformatics” collection for PLOS Computational Biology.

18.

Background

Knowledge of the origins, distribution, and inheritance of variation in the malaria parasite (Plasmodium falciparum) genome is crucial for understanding its evolution; however, the 81% (A+T) genome poses challenges to high-throughput sequencing technologies. We explore the viability of the Roche 454 Genome Sequencer FLX (GS FLX) high-throughput sequencing technology for both whole genome sequencing and fine-resolution characterization of genetic exchange in malaria parasites.

Results

We present a scheme to survey recombination in the haploid stage genomes of two sibling parasite clones, using whole genome pyrosequencing that includes a sliding window approach to predict recombination breakpoints. Whole genome shotgun (WGS) sequencing generated approximately 2 million reads, with an average read length of approximately 300 bp. De novo assembly using a combination of WGS and 3 kb paired-end libraries resulted in contigs ≤ 34 kb. More than 8,000 of the 24,599 SNP markers identified between parents were genotyped in the progeny, resulting in a marker density of approximately 1 marker/3.3 kb and allowing for the detection of previously unrecognized crossovers (COs) and many non-crossover (NCO) gene conversions throughout the genome.

Conclusions

By sequencing the 23 Mb genomes of two haploid progeny clones derived from a genetic cross at more than 30× coverage, we captured high resolution information on COs, NCOs and genetic variation within the progeny genomes. This study is the first to resequence progeny clones to examine fine structure of COs and NCOs in malaria parasites.
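The abstract mentions a sliding window approach to predicting recombination breakpoints without giving details, so the following is one plausible minimal sketch rather than the authors' method. It assumes each progeny SNP has already been assigned to a parent ("A" or "B"), smooths the calls with a majority vote over a window of neighbouring markers, and reports the marker interval where the smoothed parent of origin switches. The function names, the window size, and the toy data are all hypothetical.

```python
from collections import Counter


def smooth_calls(calls, window=5):
    """Majority-vote each parental call over a window centred on the marker."""
    half = window // 2
    smoothed = []
    for i, call in enumerate(calls):
        neighbourhood = calls[max(0, i - half): i + half + 1]
        ranked = Counter(neighbourhood).most_common()
        # On a tie (possible in truncated windows at the ends), keep the raw call.
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            smoothed.append(call)
        else:
            smoothed.append(ranked[0][0])
    return smoothed


def find_breakpoints(positions, calls, window=5):
    """Return (left, right) marker positions flanking each parental switch."""
    smoothed = smooth_calls(calls, window)
    return [
        (positions[i - 1], positions[i])
        for i in range(1, len(smoothed))
        if smoothed[i] != smoothed[i - 1]
    ]


# Toy progeny genotype: one isolated "B" call inside an "A" block, then a
# sustained switch to "B". Marker positions are made-up coordinates in bp.
toy_calls = list("AAAABAAAABBBBBB")
toy_positions = list(range(0, 15000, 1000))
toy_breakpoints = find_breakpoints(toy_positions, toy_calls)
```

Note that the smoothing treats the isolated "B" call as noise. A study interested in short non-crossover conversion tracts, as this one is, would need a separate pass over the unsmoothed calls, since a majority vote hides tracts shorter than about half the window.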

19.
20.