Similar literature
 20 similar articles found
1.
We developed 'clipping reveals structure' (CREST), an algorithm that uses next-generation sequencing reads with partial alignments to a reference genome to directly map structural variations at the nucleotide level of resolution. Application of CREST to whole-genome sequencing data from five pediatric T-lineage acute lymphoblastic leukemias (T-ALLs) and a human melanoma cell line, COLO-829, identified 160 somatic structural variations. Experimental validation exceeded 80%, demonstrating that CREST had a high predictive accuracy.
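The abstract does not describe how CREST is implemented. As a minimal sketch of the general idea that partially aligned reads can mark breakpoints, the snippet below assumes each read is available as a SAM-style record with a 1-based leftmost aligned position and a CIGAR string. The function names and the `min_clip` threshold are hypothetical and are not taken from CREST.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) pairs."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def soft_clips(cigar):
    """Return (leading, trailing) soft-clipped lengths of an alignment."""
    # Hard clips (H) can sit outside soft clips, so drop them first.
    core = [(n, op) for n, op in parse_cigar(cigar) if op != "H"]
    lead = core[0][0] if core and core[0][1] == "S" else 0
    trail = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return lead, trail


def clip_breakpoints(pos, cigar, min_clip=10):
    """List candidate breakpoints implied by soft-clipped read ends.

    pos is the 1-based leftmost aligned reference position.
    min_clip is an assumed threshold, not a value from the CREST paper.
    """
    # Operations that consume reference bases: M, D, N, = and X.
    ref_len = sum(n for n, op in parse_cigar(cigar) if op in "MDN=X")
    lead, trail = soft_clips(cigar)
    points = []
    if lead >= min_clip:
        points.append(("left", pos))  # first aligned base
    if trail >= min_clip:
        points.append(("right", pos + ref_len - 1))  # last aligned base
    return points
```

Calling `clip_breakpoints(1000, "60M40S")` reports a single right-hand candidate at the last aligned base. A real caller would additionally cluster such positions across reads and assemble the clipped sequence before calling a variant.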

2.
Genome structural variation (SV) is a major source of genetic diversity in mammals and a hallmark of cancer. Although SV is typically defined by its canonical forms (duplication, deletion, insertion, inversion and translocation), recent breakpoint mapping studies have revealed a surprising number of 'complex' variants that evade simple classification. Complex SVs are defined by clustered breakpoints that arose through a single mutation but cannot be explained by one simple end-joining or recombination event. Some complex variants exhibit profoundly complicated rearrangements between distinct loci from multiple chromosomes, whereas others involve more subtle alterations at a single locus. These diverse and unpredictable features present a challenge for SV mapping experiments. Here, we review current knowledge of complex SV in mammals, and outline techniques for identifying and characterizing complex variants using next-generation DNA sequencing.

3.
4.
This Hot Topics contribution considers two recently published papers that demonstrate the utility of advanced DNA sequencing technologies for identifying classes of mutations other than base substitutions. Data are presented from genome analyses of immortalized cell lines derived from a malignant melanoma and a small cell carcinoma of the lung. Among other observations the studies suggest the operation of novel DNA repair mechanisms or modes.

5.
Completion of the human genome reference (HGR) marked the beginning of the genomics era, yet despite its utility, its universal application is limited by the small number of individuals used in its development. This is highlighted by the presence of high-quality sequence reads that fail to map within the HGR. Sequences failing to map generally represent 2–5% of total reads and may harbor regions that would enhance our understanding of population variation, evolution, and disease. Alternatively, complete de novo assemblies can be created, but these effectively ignore the groundwork of the HGR. In an effort to find a middle ground, we developed a bioinformatic pipeline that maps paired-end reads to the HGR as separate single reads, exports unmappable reads, de novo assembles these reads per individual and then combines assemblies into a secondary reference assembly used for comparative analysis. Using 45 diverse 1000 Genomes Project individuals, we identified 351,361 contigs covering 195.5 Mb of sequence unincorporated in GRCh38. Of these, 30,879 contigs are represented in multiple individuals, with ~40% showing high sequence complexity. Genomic coordinates were generated for 99.9%, with 52.5% exhibiting high-quality mapping scores. Comparative genomic analyses with archaic humans and primates revealed significant sequence alignments, and comparisons with model organism RefSeq gene datasets identified novel human genes. If incorporated, these sequences will expand the HGR, but more importantly our data highlight that with this method low-coverage (~10–20×) next-generation sequencing can still be used to identify novel unmapped sequences to explore biological functions contributing to human phenotypic variation, disease and functionality for personal genomic medicine.

6.
The repertoire of biosynthetic enzymes found in an organism is an important clue for elucidating the chemical structural variations of various compounds. In the case of fatty acids, it is essential to examine the key enzymes, desaturases and elongases, whose combination determines the range of fatty acid structures. We systematically investigated 56 eukaryotic genomes to obtain 275 desaturase and 265 elongase homologs. Phylogenetic and motif analysis indicated that the desaturases consisted of four functionally distinct subfamilies and the elongases consisted of two subfamilies. From the combination of the subfamilies, we then predicted the ability to synthesize six types of fatty acids. Consequently, we found that the ranges of synthesizable fatty acids were often different even between closely related organisms. The reason is that, as well as diverging into subfamilies, the enzymes have functionally diverged within the individual subfamilies. Finally, we discuss how the adaptation to individual environments and the ability to synthesize specific metabolites provide some explanation for the diversity of enzyme functions. This study provides an example of a potent strategy to bridge the gap from genomic knowledge to chemical knowledge.

7.
《Molecular cell》2023,83(5):731-745.e4

8.
9.
A database search will often find a seemingly strong sequence similarity between two fragments of proteins that are not expected to have an evolutionary or functional relationship. It is tempting to suggest that the two fragments will adopt a similar conformation due to a common pattern of residues that dictate a particular substructure. To investigate the likelihood of such a structural similarity, local sequence similarities between proteins of known conformation were identified by a standard database search algorithm. A sequence similarity was considered significant when the chance probability of obtaining the relatedness score from a scan of the entire database was less than 1%. In this region both true homologies and false homologies are detected. A total of 69 false homologies were located, with lengths between 20 and 262 aligned positions. Many of these alignments had approximately 25% sequence identity and a further 25% of conservative changes. However, the results show that in general these aligned fragments did not have a significant similarity in secondary or tertiary structure. Thus local sequence similarity does not indicate a structural similarity when there is neither an evolutionary nor a functional explanation to support this. Accordingly, structure predictions based on finding a local sequence similarity with an evolutionarily unrelated protein of known conformation are unlikely to be valid.

10.
11.
Background: SNPs are the most abundant polymorphism type, and have been explored in many crop genomic studies, including rice and maize. SNP discovery in allotetraploid cotton genomes has lagged behind that of other crops due to their complexity and polyploidy. In this study, genome-wide SNPs are detected systematically using next-generation sequencing and efficient SNP genotyping methods, and used to construct a linkage map and characterize the structural variations in polyploid cotton genomes. Results: We construct an ultra-dense inter-specific genetic map comprising 4,999,048 SNP loci distributed unevenly in 26 allotetraploid cotton linkage groups and covering 4,042 cM. The map is used to order tetraploid cotton genome scaffolds for accurate assembly of G. hirsutum acc. TM-1. Recombination rates and hotspots are identified across the cotton genome by comparing the assembled draft sequence and the genetic map. Using this map, genome rearrangements and centromeric regions are identified in tetraploid cotton by combining information from the publicly available G. raimondii genome with fluorescent in situ hybridization analysis. Conclusions: We report the genotype-by-sequencing method used to identify millions of SNPs between G. hirsutum and G. barbadense. We construct and use an ultra-dense SNP map to correct sequence mis-assemblies, merge scaffolds into pseudomolecules corresponding to chromosomes, detect genome rearrangements, and identify centromeric regions in allotetraploid cottons. We find that the centromeric retro-element sequence of tetraploid cotton derived from the D subgenome progenitor might have invaded the A subgenome centromeres after allotetraploid formation. This study serves as a valuable genomic resource for genetic research and breeding of cotton.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-015-0678-1) contains supplementary material, which is available to authorized users.

12.
The benzomorphan scaffold has great potential as a lead structure, and the nature of the N-substituent is able to influence affinity, potency, and efficacy at all three opioid receptors. Building upon these considerations, we synthesized a new series of LP1 analogues by introducing naphthyl or heteroaromatic rings in the propanamide side chain of its N-substituent (9–15). In vitro competition-binding assays in HEK293 cells stably expressing MOR, DOR or KOR showed that in compound 9 the 1-naphthyl ring led to the retention of MOR affinity (Ki(MOR) = 38 ± 4 nM), displaying good selectivity versus DOR and KOR. In the electrically stimulated GPI, compound 9 was inactive as an agonist but produced an antagonist potency value (pA2) of 8.6 in the presence of the MOR agonist DAMGO. Moreover, when administered subcutaneously, it antagonized the antinociceptive effects of morphine with an AD50 = 2.0 mg/kg in the mouse tail-flick test. Modeling studies on MOR revealed that compound 9 fits very well in the binding pocket, but in a different way from the agonist LP1. The replacement of its N-substituent on the III, IV and V TM domains probably reflects an antagonist behavior. Therefore, compound 9 could represent a potential lead to further develop antagonists as valid therapeutic agents and useful pharmacological tools to study opioid receptor function.

13.
14.
Lisi V  Major F 《RNA (New York, N.Y.)》2007,13(9):1537-1545
Despite an increasing number of experimentally determined RNA structures, the gap between the number of structures and that of RNA families is still growing. To overcome this limitation, efficient and reliable RNA modeling methodologies must be developed. Toward this goal, we show here how triloop sequence-structure relationships have been inferred through a systematic analysis of all triloops found in available high-resolution structures. The structural annotation of all triloops allowed us to define discrete states of the triloop's conformational space, and therefore an explicit sequence-to-structure relation. The sequence-structure relationships inferred from this explicit relation are presented in a convenient modeling table that provides a limited set of possible three-dimensional structures given any triloop sequence. The table is indexed by the two nucleotides that form the triloop's flanking base pair, since they are shown to provide the most information about the triloop three-dimensional structures. We also report important conformational variations observed in the X-ray crystallographic structures, which we believe might be the result of RNA dynamics.

15.
Medicago is a genus of legumes (Fabaceae) that resemble common clovers with pinnately trifoliate leaves and spirally coiled seed pods, and Medicago sativa is a famous forage crop throughout the world. In this study, we systematically assembled the complete plastid genomes of 18 Medicago species, representing 35 Medicago accessions, whose genome size ranged from ~119 to 125 kb, and identified one novel inverted repeat (IR) in two accessions of Medicago soleirolii (PI537242 and PI537243), although most accessions contained no IRs. We built a phylogenetic tree based on common protein-coding sequences of 55 Medicago accessions in 38 species, which were placed into five clades with divergence dating back 9.37 million years. Global alignment revealed independent genome evolution events, including eight inversions in nine species and four intron losses (ILs) in 10 species, among which four inversions and two ILs have not been reported previously. Among the 109–111 unique genes, ndhA, rpl2, and ycf3 were under positive selection in 54 Medicago accessions. Finally, by aligning chloroplast genes against the nuclear genome assembly of M. sativa cultivar “Zhongmu No.1”, we found that a large number of chloroplast gene fragments were horizontally transferred to nuclear chromosomes in alfalfa, especially in the region chr3:47518422–48722257 of chromosome 3. Our comprehensive exploration of Medicago chloroplast genomes provides insights into Medicago diversity and genomic evolution events.

16.
The assembly of a reference genome sequence of bread wheat is challenging due to its specific features such as the genome size of 17 Gbp, polyploid nature and prevalence of repetitive sequences. BAC‐by‐BAC sequencing based on chromosomal physical maps, adopted by the International Wheat Genome Sequencing Consortium as the key strategy, reduces problems caused by the genome complexity and polyploidy, but the repeat content still hampers the sequence assembly. Availability of a high‐resolution genomic map to guide sequence scaffolding and validate physical map and sequence assemblies would be highly beneficial to obtaining an accurate and complete genome sequence. Here, we chose the short arm of chromosome 7D (7DS) as a model to demonstrate for the first time that it is possible to couple chromosome flow sorting with genome mapping in nanochannel arrays and create a de novo genome map of a wheat chromosome. We constructed a high‐resolution chromosome map composed of 371 contigs with an N50 of 1.3 Mb. Long DNA molecules achieved by our approach facilitated chromosome‐scale analysis of repetitive sequences and revealed a ~800‐kb array of tandem repeats intractable to current DNA sequencing technologies. Anchoring 7DS sequence assemblies obtained by clone‐by‐clone sequencing to the 7DS genome map provided a valuable tool to improve the BAC‐contig physical map and validate sequence assembly on a chromosome‐arm scale. Our results indicate that creating genome maps for the whole wheat genome in a chromosome‐by‐chromosome manner is feasible and that they will be an affordable tool to support the production of improved pseudomolecules.
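The abstract reports contig contiguity as an N50 value without defining it. As a hedged illustration of the usual definition (the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembly length), a minimal sketch follows. The function name `n50` is an assumption for illustration and does not come from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running sum first reaches at least half
    of the total assembly length.
    """
    if not lengths:
        raise ValueError("n50() needs at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if 2 * running >= total:
            return length
```

For the toy input `[2, 3, 4, 5, 6]` the total is 20, and the running sum reaches 11 at the contig of length 5, so `n50` returns 5.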

17.
Transmission of organelle genomes in citrus somatic hybrids   (cited by: 3; self-citations: 0; citations by others: 3)
Restriction fragment length polymorphisms (RFLPs) were used to analyze the organelle composition of two-year-old trees recovered from two different experiments: protoplasts from embryogenic cell suspensions of 'Succari' sweet orange (C. sinensis L. Osbeck) were fused with leaf protoplasts of Citropsis gilletiana Swingle & M. Kell or with leaf protoplasts of Atalantia ceylanica (Arn.) Oliv. The somatic hybrids of both fusion combinations had the mitochondrial genome from the embryogenic partner. In some somatic hybrids, non-parental fragments were observed among the mitochondrial patterns. Somatic hybrids of 'Succari' + Atalantia had plastid DNA from the embryogenic parent, while the somatic hybrids of 'Succari' + Citropsis all had both parental chloroplast genomes. The relative abundance of organelle DNAs in the donor embryogenic and leaf cells may explain the consistent transmission of the embryogenic parent mitochondrial DNA and the inheritance of the chloroplast genome from either parent. This revised version was published online in June 2006 with corrections to the Cover Date.

18.
Inheritance of organelle genomes in citrus somatic cybrids   (cited by: 4; self-citations: 0; citations by others: 4)
Restriction fragment length polymorphisms (RFLPs) were used for the characterization of citrus organelle inheritance in somatic cybrids produced during six different citrus protoplast fusions. All the cybrids in this work inherited their mitochondrial genome from the embryogenic fusion partner (callus or cell suspension). In some of the combinations, non-parental bands were observed among the mitochondrial configurations. In contrast, the cybrids inherited plastid DNA from either the embryogenic or the nonembryogenic (leaf) fusion partner. The relative abundance of organelle DNAs in the embryogenic and leaf cells was in accordance with these inheritance patterns. Stochastic processes may therefore influence the outcome of somatic cell fusions with respect to organelle genomes.

19.
Secondary structure remains the most exploitable feature for noncoding RNA (ncRNA) gene finding in genomes. However, methods based on secondary structure prediction may generate a superfluous number of candidates for validation and have yet to deliver the desired performance that can complement experimental efforts in ncRNA gene finding. This paper investigates a novel method, unpaired structural entropy (USE), as a measurement of the structure fold stability of ncRNAs. USE proves to be effective in identifying from the genome background a class of ncRNAs that contain a long-stem hairpin loop, such as precursor microRNAs (pre-miRNAs). USE correlates well with, and performs better than, other measures on pre-miRNAs, including the previously formulated structural entropy. As an SVM classifier, USE outperforms existing pre-miRNA classifiers. A long-stem hairpin loop is common to a number of other functional RNAs, including intron splicing hairpin loops and intrinsic termination hairpin loops. We believe USE can be further applied in developing ab initio prediction programs for a larger class of ncRNAs.

20.
Wang CY  Li H  Hao XD  Liu J  Wang JX  Wang WZ  Kong QP  Zhang YP 《PloS one》2011,6(6):e21613
In the past decade, a high incidence of somatic mitochondrial DNA (mtDNA) mutations has been observed in various cancerous tissues, mostly based on a fraction of the molecule; nevertheless, some of these mutations were questioned due to problems in data quality. Without a comprehensive understanding of the mtDNA mutational profile in the cancerous tissue of a specific patient, the genuine relationship between somatic mtDNA mutations and tumorigenesis is unlikely to be disclosed. To achieve this objective, the most straightforward way is to directly compare the whole mtDNA genome variation among three tissues (namely, cancerous tissue, para-cancerous tissue, and distant normal tissue) from the same patient. Considering the fact that most of the previous studies on the role of mtDNA in colorectal tumor focused merely on the D-loop or a partial segment of the molecule, in the current study we collected three tissues (cancerous, para-cancerous and normal tissues) from each of 20 patients with colorectal tumor and completely sequenced the mitochondrial genome of each tissue. Our results reveal a relatively low incidence of somatic mutations in these patients; intriguingly, all somatic mutations are in heteroplasmic status. Surprisingly, the observed somatic mutations are not restricted to cancer tissues, for the para-cancer tissues and distant normal tissues also harbor somatic mtDNA mutations, with a lower frequency than cancerous tissues but higher than that observed in the general population. Our results suggest that somatic mtDNA mutations in cancerous tissues cannot be simply explained as a consequence of tumorigenesis; meanwhile, the somatic mtDNA mutations in normal tissues might reflect an altered physiological environment in cancer patients.
