Similar articles
 20 similar articles found
1.

Background

Accurate catalogs of structural variants (SVs) in mammalian genomes are necessary to elucidate the potential mechanisms that drive SV formation and to assess their functional impact. Next generation sequencing methods for SV detection are an advance on array-based methods, but are almost exclusively limited to four basic types: deletions, insertions, inversions and copy number gains.

Results

By visual inspection of 100 Mbp of genome to which next generation sequence data from 17 inbred mouse strains had been aligned, we identify and interpret 21 paired-end mapping patterns, which we validate by PCR. These paired-end mapping patterns reveal a greater diversity and complexity in SVs than previously recognized. In addition, Sanger-based sequence analysis of 4,176 breakpoints at 261 SV sites reveals additional complexity at approximately a quarter of the structural variants analyzed. We find micro-deletions and micro-insertions at SV breakpoints, ranging from 1 to 107 bp, and SNPs that extend breakpoint micro-homology and may catalyze SV formation.

Conclusions

An integrative approach using experimental analyses to train computational SV calling is essential for the accurate resolution of the architecture of SVs. We find considerable complexity in SV formation; about a quarter of SVs in the mouse are composed of a complex mixture of deletion, insertion, inversion and copy number gain. Computational methods can be adapted to identify most paired-end mapping patterns.

2.
3.
A set of mutated SV40 early polyadenylation signals (SV40pA) with varying strengths is generated by mutating the AATAAA sequence in the wild-type SV40pA. They are shown to control the expression level of a gene over a 10-fold range using luciferase reporter genes in transient transfection assays. The relative strength of these SV40pA variants remains similar under three commonly used mammalian promoters and in five mammalian cell lines. Application of SV40pA variants for controlling the expression level of multiple genes is demonstrated in a study of monoclonal antibody (mAb) synthesis in mammalian cells. By using SV40pA variants of different strengths, the expression of light chain (LC) and heavy chain (HC) genes encoded in a single vector is independently altered, which results in different ratios of LC to HC expression spanning a range from 0.24 to 16.42. The changes in gene expression are determined by measuring mRNA levels and intracellular LC and HC polypeptides. It is found that a substantial decrease of HC expression, which increases the LC/HC mRNA ratio, only slightly reduces mAb production. However, reducing the LC expression by a similar magnitude, which decreases the LC/HC mRNA ratio, results in a sharp decline of mAb production to trace amounts. This set of SV40pA variants offers a new tool for accurate control of the relative expression levels of multiple genes. It will have wide-ranging applications in fields related to the study of biosynthesis of multi-subunit proteins, proteomic research on protein interactions, and multi-gene metabolic engineering.

4.
Several bioinformatics methods have been proposed for the detection and characterization of genomic structural variation (SV) from ultra high-throughput genome resequencing data. Recent surveys show that comprehensive detection of SV events of different types between an individual resequenced genome and a reference sequence is best achieved through the combination of methods based on different principles (split mapping, reassembly, read depth, insert size, etc.). The improvement of individual predictors is thus an important objective. In this study, we propose a new method that combines deviations from expected library insert sizes and additional information from local patterns of read mapping and uses supervised learning to predict the position and nature of structural variants. We show that our approach provides greatly increased sensitivity with respect to other tools based on paired end read mapping at no cost in specificity, and it makes reliable predictions of very short insertions and deletions in repetitive and low-complexity genomic contexts that can confound tools based on split mapping of reads.
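The insert-size signal mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method, which also uses local read-mapping patterns and supervised learning; it only flags read pairs whose mapped insert size deviates strongly from the library expectation. The function name `classify_pairs`, the z-score cutoff of 3, and the labels are assumptions made for illustration.

```python
def classify_pairs(insert_sizes, lib_mean, lib_sd, z_cutoff=3.0):
    """Label each read pair by how far its insert size is from the library mean.

    A pair that maps much farther apart than expected suggests sequence missing
    in the sample (deletion-like); a pair that maps much closer together suggests
    extra sequence in the sample (insertion-like). Hypothetical helper.
    """
    labels = []
    for size in insert_sizes:
        z = (size - lib_mean) / lib_sd  # deviation in units of library SD
        if z > z_cutoff:
            labels.append("deletion-like")
        elif z < -z_cutoff:
            labels.append("insertion-like")
        else:
            labels.append("concordant")
    return labels


# Example library: mean insert size 300 bp, standard deviation 30 bp
print(classify_pairs([300, 900, 50], lib_mean=300, lib_sd=30))
# ['concordant', 'deletion-like', 'insertion-like']
```

A fixed cutoff like this misses short events whose size shift is within the library's normal spread, which is one reason a method might combine the signal with other features as the abstract describes.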

5.
The development of any organism is a complex dynamic process that is controlled by a network of genes as well as by environmental factors. Traditional mapping approaches for analysing phenotypic data measured at a single time point are too simple to reveal the genetic control of developmental processes. A general statistical mapping framework, called functional mapping, has been proposed to characterize, in a single step, the quantitative trait loci (QTLs) or nucleotides (QTNs) that underlie a complex dynamic trait. Functional mapping estimates mathematical parameters that describe the developmental mechanisms of trait formation and expression for each QTL or QTN. The approach provides a useful quantitative and testable framework for assessing the interplay between gene actions or interactions and developmental changes.
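The idea of describing a trait's trajectory with a few curve parameters per genotype can be sketched as follows. This is a simplified illustration, not the functional mapping procedure itself: it assumes a logistic growth curve as the trait model, assumes genotype groups are already known, and fits by least squares over a small parameter grid rather than by maximum likelihood. All names (`logistic_growth`, `fit_grid`) and values are hypothetical.

```python
import math
from itertools import product


def logistic_growth(t, a, b, r):
    """Logistic growth curve: a is the asymptote, b sets the start, r the rate."""
    return a / (1.0 + b * math.exp(-r * t))


def sse(times, values, params):
    """Sum of squared errors between observations and the curve."""
    return sum((v - logistic_growth(t, *params)) ** 2 for t, v in zip(times, values))


def fit_grid(times, values, a_grid, b_grid, r_grid):
    """Return the (a, b, r) grid point that minimises the squared error."""
    return min(product(a_grid, b_grid, r_grid), key=lambda p: sse(times, values, p))


# Noise-free toy trajectory for one genotype group, generated from known parameters
times = [0, 1, 2, 3, 4, 5]
values = [logistic_growth(t, 10.0, 9.0, 1.0) for t in times]
best = fit_grid(times, values, [8.0, 10.0, 12.0], [5.0, 9.0], [0.5, 1.0, 1.5])
print(best)  # (10.0, 9.0, 1.0)
```

Fitting one parameter set per genotype group and comparing the sets is the intuition; testing whether the curves differ between genotypes at a putative QTL is the part this sketch leaves out.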

6.
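Using single-nucleotide polymorphisms (SNPs) as genetic markers, genome-wide association studies (GWAS) have identified more than 3,800 genetic susceptibility regions in more than 660 diseases (or traits). However, the most significantly associated or causal variants in these regions, and their biological functions, are not fully understood. Identifying these sites will help to elucidate the biological mechanisms of complex diseases and to discover new disease markers. One of the main tasks of the post-GWAS era is to use fine-mapping studies to find the most significantly associated or causal susceptibility loci within the susceptibility regions of complex diseases and to clarify their biological functions. For common variants, SNP density can be increased by imputation or resequencing to find the most significantly associated SNP, and functional SNPs and susceptibility genes can be sought through functional element analysis, expression quantitative trait locus (eQTL) analysis, and haplotype analysis. For rare variants, fine mapping can be carried out by resequencing, rare haplotype analysis, family-based analysis, and burden tests. This article reviews these strategies and the problems they face.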

7.
J Bernúes  R Beltrán  F Azorín 《Gene》1991,108(2):269-274
Repetitive d(CT.GA)n sequences are commonly found in eukaryotic genomic DNA. They are frequently located in sites involved in genetic recombination or in promoter regions. To test for their possible biological function, a d(CT.GA)22 synthetic sequence was introduced into the genome of SV40, since it constitutes an appropriate model system for eukaryotic chromatin. When SV40 infects permissive cells, it proliferates in the form of a minichromosome. The simple repetitive sequence indicated above was inserted at the unique HpaII site of SV40 (at nt 346), and the genomic stability of SV40 recombinants carrying the d(CT.GA)22 sequence (SV/CT22 viruses) was analyzed. Upon serial passage through permissive CV1 cells, SV/CT22 recombinants show an increased production of defective viruses. Generation of SV/CT22 variants is likely to take place via recombination between and within viral molecules. The enhancement of the rate of recombination induced by the repetitive sequence is likely to be related to its known propensity to form triple-stranded structures. Many different variants coexist in the same viral population indicating that the mechanism by which they are produced is not unique. One variant (SV/X), showing a replicative advantage, was characterized in detail. Variant SV/X accounts for a large proportion of the total viral population. Its genomic organization corresponds to a tandem duplication of an early SV40 DNA fragment spanning from approx. nt 3200-nt 160. Variant SV/X contains a duplicated SV40 ori.

8.
Next-generation sequencing (NGS) technologies have revolutionised the analysis of genomic structural variants (SVs), providing significant insights into SV de novo formation based on analyses of rearrangement breakpoint junctions. The short DNA reads generated by NGS, however, have also created novel obstacles by biasing the ascertainment of SVs, an aspect that we refer to as the 'short-read dilemma'. For example, recent studies have found that SVs are often complex, with SV formation generating large numbers of breakpoints in a single event (multi-breakpoint SVs) or structurally polymorphic loci having multiple allelic states (multi-allelic SVs). This complexity may be obscured in short reads, unless the data is analysed and interpreted within its wider genomic context. We discuss how novel approaches will help to overcome the short-read dilemma, and how integration of other sources of information, including the structure of chromatin, may help in the future to deepen the understanding of SV formation processes.

9.
《Genomics》2023,115(2):110568
It has recently been shown that structural variants (SV) can have a higher impact on gene expression variation compared to single nucleotide variants (SNV) in different plant species. Additionally, SV were associated with phenotypic variation in several crops. However, compared to the established SV detection based on short-read sequencing, fewer approaches have been described for linked-read based SV calling. We therefore evaluated the performance of six linked-read SV callers compared to an established short-read SV caller based on simulated linked-reads in tetraploid potato. The objectives of our study were to i) compare the performance of SV callers based on linked-read sequencing to short-read sequencing, ii) examine the influence of SV type, SV length, haplotype incidence (HI), as well as sequencing coverage on the SV calling performance in the tetraploid potato genome, and iii) evaluate the accuracy of detecting insertions by linked-read compared to short-read sequencing. We observed high break point resolutions (BPR) when detecting short SV and slightly lower BPR for large SV. Our observations highlighted the importance of short-read signals provided by Manta and LinkedSV to detect short SV. Manta and NAIBR performed well for detecting larger deletions, inversions, and duplications. Detected large SV were weakly influenced by the HI. Furthermore, we illustrated that large insertions can be assembled by Novel-X. Our results suggest the usage of the short-read and linked-read SV callers Manta, NAIBR, LinkedSV, and Novel-X based on at least 90x linked-read sequencing coverage to ensure the detection of a broad range of SV in the tetraploid potato genome.

10.
11.
12.
Natural populations of the fruit fly, Drosophila melanogaster, segregate genetic variation that leads to cardiac disease phenotypes. One nearly isogenic line from a North Carolina peach orchard, WE70, is shown to harbor two genetically distinct heart phenotypes: elevated incidence of arrhythmias, and a dramatically constricted heart diameter in both diastole and systole, with resemblance to restrictive cardiomyopathy in humans. Assuming the source to be rare variants of large effect, we performed Bulked Segregant Analysis using genomic DNA hybridization to Affymetrix chips to detect single feature polymorphisms, but found that the mutant phenotypes are more likely to have a polygenic basis. Further mapping efforts revealed a complex architecture wherein the constricted cardiomyopathy phenotype was observed in individual whole chromosome substitution lines, implying that variants on both major autosomes are sufficient to produce the phenotype. A panel of 170 Recombinant Inbred Lines (RIL) was generated, and a small subset of mutant lines selected, but these each complemented both whole chromosome substitutions, implying a non-additive (epistatic) contribution to the “disease” phenotype. Low coverage whole genome sequencing was also used to attempt to map chromosomal regions contributing to both the cardiomyopathy and arrhythmia, but a polygenic architecture had to be again inferred to be most likely. These results show that an apparently simple rare phenotype can have a complex genetic basis that would be refractory to mapping by deep sequencing in pedigrees. We present this as a cautionary tale regarding assumptions related to attempts to map new disease mutations on the assumption that probands carry a single causal mutation.

13.
Photosensitive reflex epilepsy is caused by the combination of an individual's enhanced sensitivity with relevant light stimuli, such as stroboscopic lights or video games. This is the most common reflex epilepsy in humans; it is characterized by the photoparoxysmal response, which is an abnormal electroencephalographic reaction, and seizures triggered by intermittent light stimulation. Here, by using genetic mapping, sequencing and functional analyses, we report that a mutation in the acceptor site of the second intron of SV2A (the gene encoding synaptic vesicle glycoprotein 2A) is causing photosensitive reflex epilepsy in a unique vertebrate model, the Fepi chicken strain, a spontaneous model where the neurological disorder is inherited as an autosomal recessive mutation. This mutation causes an aberrant splicing event and significantly reduces the level of SV2A mRNA in homozygous carriers. Levetiracetam, a second generation antiepileptic drug, is known to bind SV2A, and SV2A knock-out mice develop seizures soon after birth and usually die within three weeks. The Fepi chicken survives to adulthood and responds to levetiracetam, suggesting that the low-level expression of SV2A in these animals is sufficient to allow survival, but does not protect against seizures. Thus, the Fepi chicken model shows that the role of the SV2A pathway in the brain is conserved between birds and mammals, in spite of a large phylogenetic distance. The Fepi model appears particularly useful for further studies of physiopathology of reflex epilepsy, in comparison with induced models of epilepsy in rodents. Consequently, SV2A is a very attractive candidate gene for analysis in the context of both mono- and polygenic generalized epilepsies in humans.

14.
The importance of structural variants (SVs) for human phenotypes and diseases is now recognized. Although a variety of SV detection platforms and strategies that vary in sensitivity and specificity have been developed, few benchmarking procedures are available to confidently assess their performances in biological and clinical research. To facilitate the validation and application of these SV detection approaches, we established an Asian reference material by characterizing the genome of an Epstein-Barr virus (EBV)-immortalized B lymphocyte line along with identified benchmark regions and high-confidence SV calls. We established a high-confidence SV callset with 8938 SVs by integrating four alignment-based SV callers, including 109× Pacific Biosciences (PacBio) continuous long reads (CLRs), 22× PacBio circular consensus sequencing (CCS) reads, 104× Oxford Nanopore Technologies (ONT) long reads, and 114× Bionano optical mapping platform, and one de novo assembly-based SV caller using CCS reads. A total of 544 randomly selected SVs were validated by PCR amplification and Sanger sequencing, demonstrating the robustness of our SV calls. Combining trio-binning-based haplotype assemblies, we established an SV benchmark for identifying false negatives and false positives by constructing the continuous high-confidence regions (CHCRs), which covered 1.46 gigabase pairs (Gb) and 6882 SVs supported by at least one diploid haplotype assembly. Establishing high-confidence SV calls for a benchmark sample that has been characterized by multiple technologies provides a valuable resource for investigating SVs in human biology, disease, and clinical research.
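How a benchmark callset is used to count false positives and false negatives can be sketched in a few lines. This is an illustrative toy, not the evaluation procedure of the study: the dictionary layout of an SV record, the 500 bp breakpoint tolerance, and the one-to-one greedy matching are all assumptions.

```python
def is_match(call, truth, tol=500):
    """Two SVs match if chromosome and type agree and both breakpoints are within tol bp."""
    return (
        call["chrom"] == truth["chrom"]
        and call["type"] == truth["type"]
        and abs(call["start"] - truth["start"]) <= tol
        and abs(call["end"] - truth["end"]) <= tol
    )


def benchmark(calls, truth_set, tol=500):
    """Greedily match each call to at most one unused truth SV and summarise."""
    used = set()
    tp = 0
    for call in calls:
        for i, truth in enumerate(truth_set):
            if i not in used and is_match(call, truth, tol):
                used.add(i)
                tp += 1
                break
    fp = len(calls) - tp       # calls with no benchmark support
    fn = len(truth_set) - tp   # benchmark SVs the caller missed
    precision = tp / len(calls) if calls else 0.0
    recall = tp / len(truth_set) if truth_set else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}


truth = [
    {"chrom": "chr1", "type": "DEL", "start": 10_000, "end": 12_000},
    {"chrom": "chr2", "type": "INS", "start": 50_000, "end": 50_001},
]
calls = [
    {"chrom": "chr1", "type": "DEL", "start": 10_100, "end": 11_950},  # close to truth[0]
    {"chrom": "chr3", "type": "DEL", "start": 7_000, "end": 9_000},    # unsupported call
]
print(benchmark(calls, truth))
# {'tp': 1, 'fp': 1, 'fn': 1, 'precision': 0.5, 'recall': 0.5}
```

In practice such a comparison would be restricted to the high-confidence regions, since calls outside them cannot be reliably scored as true or false.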

15.
16.
Three simian virus (SV40)-phi X174 recombinant genomes were isolated from single BSC-1 monkey cells cotransfected with SV40 and phi X174 RF1 DNAs. The individual cell progenies were amplified, cloned, and mapped by a combination of restriction endonuclease and heteroduplex analyses. In each case, the 600 to 1,000 base pairs of phi X174 DNA (derived from different regions of the phi X174 genome) were present as single inserts, located in either the early or late SV40 regions; the deletion of SV40 DNA was greater than the size of the insert; and the remaining portions of the hybrid genome were indistinguishable from wild-type SV40 DNA, as judged by both mapping and biological tests. Hence, apart from the deletion which accommodates the phi X174 DNA insert, no other rearrangements of SV40 DNA were detected. The restriction map of a SV40-phi X174 recombinant DNA isolate before molecular cloning was indistinguishable from those of two separate cloned derivatives of that isolate, indicating that the species cloned was the major amplifiable recombinant structure generated by a single recombinant-producing cell. The relative simplicity of the SV40-phi X174 recombinant DNA examined is consistent with the notion that most recombinant-producing BSC-1 cells support single recombination events generating only one amplifiable recombinant structure.

17.
Prominin-1, a heavily glycosylated pentaspan membrane protein, is mainly known for its function as a marker for (cancer) stem cells, although it can also be detected on differentiated cells. Mouse prominin-1 expression is heavily regulated by splicing in eight different variants. The function or the expression pattern of prominin-1 and its splice variants (SVs) is thus far unknown. In this study, we analyzed the expression of the prominin-1 splice variants on mRNA level in several mouse tissues and found a broad tissue expression of the majority of SVs, but a specific set of SVs had a much more restricted expression profile. For instance, the testis expressed only SV3 and SV7. Moreover, SV8 was solely detected in the eye. Intriguingly, prominin-1 knockout mice do not suffer from gross abnormalities, but do show signs of blindness, which suggests that SV8 has a specific function in this tissue. In addition, database searches for putative promoter regions in the mouse prominin-1 gene revealed three potential promoter regions that could be linked to specific SVs. Interestingly, for both SV7 and SV8, a specific potential promoter region could be identified. To conclude, the majority of mouse prominin-1 splice variants are widely expressed in mouse tissues. However, specific expression of a few variants, likely driven by specific promoters, suggests distinct regulation and a potentially important function for these variants in certain tissues.

18.
The occurrence of somaclonal variation among regenerants derived through indirect shoot organogenesis from leaf explants of three Dieffenbachia cultivars Camouflage, Camille and Star Bright was evaluated. Three types of somaclonal variants (SV1, SV2, and SV3) were identified from regenerated plants of cv. Camouflage, one type from cv. Camille, but none from cv. Star Bright. The three variants had novel and distinct foliar variegation patterns compared to cv. Camouflage parental plants. Additionally, SV1 was taller with a larger canopy and longer leaves than parental plants and SV2. SV2 and SV3 did not produce basal shoots (single stem) but basal shoot numbers between SV1 and parental plants were similar ranging from three to four. The variant type identified from regenerated cv. Camille had lanceolate leaves compared to the oblong leaves of the parent. This variant type also grew taller and had a larger canopy than parental plants. The rates of somaclonal variation were up to 40.4% among regenerated cv. Camouflage plants and 2.6% for regenerated cv. Camille. The duration of callus culture had no effect on somaclonal variation rates of cv. Camouflage as the rates between plants regenerated from 8 months to 16 months of callus culture were similar. The phenotypes of the identified variants were stable as verified by their progenies after cutting propagation. This study demonstrated the potential for new cultivar development by selecting callus-derived somaclonal variants of Dieffenbachia.

19.
N-glycolyl GM1 ganglioside as a receptor for simian virus 40
Carbohydrate microarrays have emerged as powerful tools in analyses of microbe-host interactions. Using a microarray with 190 sequence-defined oligosaccharides in the form of natural glycolipids and neoglycolipids representative of diverse mammalian glycans, we examined interactions of simian virus 40 (SV40) with potential carbohydrate receptors. While the results confirmed the high specificity of SV40 for the ganglioside GM1, they also revealed that N-glycolyl GM1 ganglioside [GM1(Gc)], which is characteristic of simian species and many other nonhuman mammals, is a better ligand than the N-acetyl analog [GM1(Ac)] found in mammals, including humans. After supplementing glycolipid-deficient GM95 cells with GM1(Ac) and GM1(Gc) gangliosides and the corresponding neoglycolipids with phosphatidylethanolamine lipid groups, it was found that GM1(Gc) analogs conferred better virus binding and infectivity. Moreover, we visualized the interaction of NeuGc with VP1 protein of SV40 by molecular modeling and identified a conformation for GM1(Gc) ganglioside in complex with the virus VP1 pentamer that is compatible with its presentation as a membrane receptor. Our results open the way not only to detailed studies of SV40 infection in relation to receptor expression in host cells but also to the monitoring of changes that may occur with time in receptor usage by the virus.

20.
The epigenome is defined as a type of information that can be transmitted independently of the DNA sequence, at the chromatin level, through post-translational modifications present on histone tails. Recent advances in the identification of histone 3 variants suggest a new model of information transmission through deposition of specific histone variants. To date, several non-centromeric histone 3 variants have been identified in mammals. Despite protein sequence similarity, specific deposition complexes have been characterized for both histone 3.1 (H3.1) and histone 3.3 (H3.3), whereas no deposition complex for histone 3.2 (H3.2) has been identified to date. Here, we identified human H3.2 partners by immunopurification of nuclear H3.2 complexes followed by mass spectrometry analysis. Further biochemical analyses highlighted two major complexes associated with H3.2, one containing chromatin associated factor-1 subunits and the other consisting of a subcomplex of mini chromosome maintenance helicases, together with Asf1. The purified complexes could associate with a DNA template in vitro.
