Similar literature
20 similar articles found; search took 46 ms.
1.
Brandström M, Ellegren H. Genetics, 2007, 176(3): 1691-1701
It is increasingly recognized that insertions and deletions (indels) are an important source of genetic as well as phenotypic divergence and diversity. We analyzed length polymorphisms identified through partial (0.25×) shotgun sequencing of three breeds of domestic chicken made by the International Chicken Polymorphism Map Consortium. A data set of 140,484 short indel polymorphisms in unique DNA was identified after filtering for microsatellite structures. There was a significant excess of tandem duplicates at indel sites, with deletions of a duplicate motif outnumbering the generation of duplicates through insertion. Indel density was lower in microchromosomes than in macrochromosomes, in the Z chromosome than in autosomes, and in 100 bp of upstream sequence, 5'-UTR, and first introns than in intergenic DNA and in other introns. Indel density was highly correlated with single nucleotide polymorphism (SNP) density. The mean density of indels in pairwise sequence comparisons was 1.9 × 10⁻⁴ indel events/bp, approximately 5% the density of SNPs segregating in the chicken genome. The great majority of indels involved a limited number of nucleotides (median 1 bp), with A-rich motifs being overrepresented at indel sites. The overrepresentation of deletions at tandem duplicates indicates that replication slippage in duplicate sequences is a common mechanism behind indel mutation. The correlation between indel and SNP density indicates common effects of mutation and/or selection on the occurrence of indels and point mutations.

2.
Insertions and deletions (indels) are important types of structural variations. Obtaining accurate genotypes of indels may facilitate further genetic study. There are a few existing methods for calling indel genotypes from sequence reads. However, none of these tools can accurately call indel genotypes for indels of all lengths, especially for low coverage sequence data. In this paper, we present GINDEL, an approach for calling genotypes of both insertions and deletions from sequence reads. GINDEL uses a machine learning approach which combines multiple features extracted from next generation sequencing data. We test our approach on both simulated and real data and compare with existing tools, including Genome STRiP, Pindel and Clever-sv. Results show that GINDEL works well for deletions larger than 50 bp on both high and low coverage data. Also, GINDEL performs well for insertion genotyping on both simulated and real data. For comparison, Genome STRiP performs less well for shorter deletions (50–200 bp) on both simulated and real sequence data from the 1000 Genomes Project. Clever-sv performs well for intermediate deletions (200–1500 bp) but is less accurate when coverage is low. Pindel only works well for high coverage data, but does not perform well at low coverage. To summarize, we show that GINDEL not only can call genotypes of insertions and deletions (both short and long) for high and low coverage population sequence data, but also is more accurate and efficient than other approaches. The program GINDEL can be downloaded at: http://sourceforge.net/p/gindel

3.
Structural variation (SV) is a significant component of the genetic etiology of both neurodevelopmental and psychiatric disorders; however, routine guidelines for clinical genetic screening have been established only in the former category. Genome-wide chromosomal microarray (CMA) can detect genomic imbalances such as copy-number variants (CNVs), but balanced chromosomal abnormalities (BCAs) still require karyotyping for clinical detection. Moreover, submicroscopic BCAs and subarray threshold CNVs are intractable, or cryptic, to both CMA and karyotyping. Here, we performed whole-genome sequencing using large-insert jumping libraries to delineate both cytogenetically visible and cryptic SVs in a single test among 30 clinically referred youth representing a range of severe neuropsychiatric conditions. We detected 96 SVs per person on average that passed filtering criteria above our highest-confidence resolution (6,305 bp) and an additional 111 SVs per genome below this resolution. These SVs rearranged 3.8 Mb of genomic sequence and resulted in 42 putative loss-of-function (LoF) or gain-of-function mutations per person. We estimate that 80% of the LoF variants were cryptic to clinical CMA. We found myriad complex and cryptic rearrangements, including a “paired” duplication (360 kb, 169 kb) that flanks a 5.25 Mb inversion that appears in 7 additional cases from clinical CNV data among 47,562 individuals. Following convergent genomic profiling of these independent clinical CNV data, we interpreted three SVs to be of potential clinical significance. These data indicate that sequence-based delineation of the full SV mutational spectrum warrants exploration in youth referred for neuropsychiatric evaluation and clinical diagnostic SV screening more broadly.

4.
5.
Structural variants (SVs) are a largely unstudied feature of plant genome evolution, despite the fact that SVs contribute substantially to phenotypes. In this study, we discovered SVs across a population sample of 347 high-coverage, resequenced genomes of Asian rice (Oryza sativa) and its wild ancestor (O. rufipogon). In addition to this short-read data set, we also inferred SVs from whole-genome assemblies and long-read data. Comparisons among data sets revealed different features of genome variability. For example, genome alignment identified a large (∼4.3 Mb) inversion in indica rice varieties relative to japonica varieties, and long-read analyses suggest that ∼9% of genes from the outgroup (O. longistaminata) are hemizygous. We focused, however, on the resequencing sample to investigate the population genomics of SVs. Clustering analyses with SVs recapitulated the rice cultivar groups that were also inferred from SNPs. However, the site-frequency spectrum of each SV type—which included inversions, duplications, deletions, translocations, and mobile element insertions—was skewed toward lower frequency variants than synonymous SNPs, suggesting that SVs may be predominantly deleterious. Among transposable elements, SINE and mariner insertions were found at especially low frequency. We also used SVs to study domestication by contrasting between rice and O. rufipogon. Cultivated genomes contained ∼25% more derived SVs and mobile element insertions than O. rufipogon, indicating that SVs contribute to the cost of domestication in rice. Peaks of SV divergence were enriched for known domestication genes, but we also detected hundreds of genes gained and lost during domestication, some of which were enriched for traits of agronomic interest.

6.
The APC gene is a putative human tumor-suppressor gene responsible for adenomatous polyposis coli (APC), an inherited, autosomal dominant predisposition to colon cancer. It is also implicated in the development of sporadic colorectal tumors. The characterization of APC gene mutations in APC patients is clinically important because DNA-based tests can be applied for presymptomatic diagnosis once a specific mutation has been identified in a family. Moreover, the identification of the spectrum of APC gene mutations in patients is of great interest in the study of the biological properties of the APC gene product. We analyzed the entire coding region of the APC gene by the PCR–single-strand conformation polymorphism method in 42 unrelated Italian APC patients. Mutations were found in 12 cases. These consist of small deletions (5–14 bp) leading to frameshifts; all are localized within exon 15. Two of these deletions, a 5-bp deletion at position 3183–3187 and a 5-bp deletion at position 3926–3930, are present in 3/42 and 7/42 cases of our series, respectively, indicating the presence of mutational hot spots at these two sites.
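The frameshift consequence described above follows from the triplet reading frame: a deletion shifts the frame whenever its length is not a multiple of three. The following is a minimal sketch of that rule only, not the authors' analysis; the helper names `is_frameshift` and `span_length` are hypothetical, and the coordinates are the two hot-spot deletions quoted in the abstract.

```python
def span_length(start: int, end: int) -> int:
    """Length in bp of an inclusive coordinate range such as 3183-3187."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def is_frameshift(deletion_length: int) -> bool:
    """Return True when a deletion of this length (bp) disrupts the reading frame."""
    if deletion_length <= 0:
        raise ValueError("deletion length must be positive")
    # Codons are 3 nt long, so only multiples of 3 preserve the frame.
    return deletion_length % 3 != 0


# The two recurrent deletions quoted in the abstract (inclusive positions).
hot_spots = [(3183, 3187), (3926, 3930)]

for start, end in hot_spots:
    length = span_length(start, end)
    print(f"{start}-{end}: {length} bp, frameshift={is_frameshift(length)}")
```

Both hot-spot deletions span 5 bp, so the rule classifies them as frameshifting, in line with the abstract.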

7.
8.
Most insertions or deletions generated by CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9) endonucleases are short (<25 bp), but unpredictable on-target long DNA deletions (>500 bp) can be observed. The possibility of generating long on-target DNA deletions poses safety risks to somatic genome editing and makes the outcomes of genome editing less predictable. Methods for generating refined mutations are desirable but currently unavailable. Here, we show that fusing Escherichia coli DNA polymerase I or the Klenow fragment to Cas9 greatly increases the frequencies of 1-bp deletions and decreases >1-bp deletions or insertions. Importantly, doing so also greatly decreases the generation of long deletions, including those >2 kb. In addition, templated insertions (the insertion of the nucleotide 4 nt upstream of the protospacer adjacent motif) were increased relative to other insertions. Counteracting DNA resection was one of the mechanisms perturbing deletion sizes. Targeting DNA polymerase to double-strand breaks did not increase off-targets or base substitution rates around the cleavage sites, yet increased editing efficiency in primary cells. Our strategy makes it possible to generate refined DNA mutations for improved safety without sacrificing efficiency of genome editing.

9.
The combined analysis of haplotype panels with phenotype clinical cohorts is a common approach to explore the genetic architecture of human diseases. However, genetic studies are mainly based on single nucleotide variants (SNVs) and small insertions and deletions (indels). Here, we help fill this gap by generating a dense haplotype map focused on the identification, characterization, and phasing of structural variants (SVs). By integrating multiple variant identification methods and Logistic Regression Models (LRMs), we present a catalogue of 35 431 441 variants, including 89 178 SVs (≥50 bp), 30 325 064 SNVs and 5 017 199 indels, across 785 Illumina high coverage (30×) whole-genomes from the Iberian GCAT Cohort, containing a median of 3.52M SNVs, 606 336 indels and 6393 SVs per individual. The haplotype panel is able to impute up to 14 360 728 SNVs/indels and 23 179 SVs, showing a 2.7-fold increase for SVs compared with available genetic variation panels. The value of this panel for SV analysis is shown through an imputed rare Alu element located in a new locus associated with mononeuritis of lower limb, a rare neuromuscular disease. This study represents the first deep characterization of genetic variation within the Iberian population and the first operational haplotype panel to systematically include SVs in genome-wide genetic studies.
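The catalogue figures quoted above can be checked with simple arithmetic. The sketch below sums the three variant classes, compares the result with the reported total, and derives each class's share; the variable names are illustrative, and only the counts come from the abstract.

```python
# Variant counts quoted in the abstract (785 whole genomes).
catalogue = {
    "SVs": 89_178,
    "SNVs": 30_325_064,
    "indels": 5_017_199,
}
reported_total = 35_431_441

# The three classes should add up to the reported catalogue size.
computed_total = sum(catalogue.values())
print("totals agree:", computed_total == reported_total)

# Share of each variant class in the catalogue, as a percentage.
shares = {name: 100 * count / computed_total for name, count in catalogue.items()}
for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```

The three classes sum exactly to the reported total, with SVs making up well under 1% of the catalogue.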

10.
Nucleotide substitution, insertion and deletion (indel) events are the major driving forces that have shaped genomes. Using the recently identified human ribosomal protein (RP) pseudogene sequences, we have thoroughly studied DNA mutation patterns in the human genome. We analyzed a total of 1726 processed RP pseudogene sequences, comprising more than 700 000 bases. To be sure to differentiate the sequence changes occurring in the functional genes during evolution from those occurring in pseudogenes after they were fixed in the genome, we used only pseudogene sequences originating from parts of RP genes that are identical in human and mouse. Overall, we found that nucleotide transitions are more common than transversions, by roughly a factor of two. Moreover, the substitution rates amongst the 12 possible nucleotide pairs are not homogeneous as they are affected by the type of immediately neighboring nucleotides and the overall local G+C content. Finally, our dataset is large enough that it has many indels, thus allowing for the first time statistically robust analysis of these events. Overall, we found that deletions are about three times more common than insertions (3740 versus 1291). The frequencies of both these events follow characteristic power–law behavior associated with the size of the indel. However, unexpectedly, the frequency of 3 bp deletions (in contrast to 3 bp insertions) violates this trend, being considerably higher than that of 2 bp deletions. The possible biological implications of such a 3 bp bias are discussed.
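The transition/transversion distinction and the deletion-to-insertion ratio mentioned above can be illustrated briefly. This is a minimal sketch under the standard definition (a transition swaps purine for purine or pyrimidine for pyrimidine; everything else is a transversion), not the authors' pipeline; the function name `classify_substitution` is hypothetical, and the indel counts are those quoted in the abstract.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a substitution")
    # Same chemical class on both sides means a transition.
    if {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES:
        return "transition"
    return "transversion"


# Enumerate the 12 possible ordered nucleotide pairs mentioned in the abstract.
pairs = [(r, a) for r in "ACGT" for a in "ACGT" if r != a]
counts = {"transition": 0, "transversion": 0}
for r, a in pairs:
    counts[classify_substitution(r, a)] += 1
print(len(pairs), counts)

# Deletion-to-insertion ratio from the counts quoted in the abstract.
deletions, insertions = 3740, 1291
ratio = deletions / insertions
print(f"deletions per insertion: {ratio:.2f}")
```

Only 4 of the 12 ordered pairs are transitions, so a transition excess of about twofold is notable given that transversions have twice as many possible pairs.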

11.
12.
The use of whole-genome microarrays for monitoring mutagenized or otherwise engineered genetic derivatives is a potentially powerful tool for checking genomic integrity. Using comparative genomic hybridization of a number of unrelated, directed deletion mutants in Escherichia coli K-12 MG1655, we identified unintended secondary genomic deletions in the flhDC region in Δfnr, Δcrp, and ΔcreB mutants. These deletions were confirmed by PCR and phenotypic tests. Our findings show that nonmotile progeny are found in some MG1655 directed deletion mutants, and studies on the effects of gene knockouts should be viewed with caution when the mutants have not been screened for the presence of secondary deletions or confirmed by other methods.

13.
14.
Accurate estimates of mutation rates provide critical information to analyze genome evolution and organism fitness. We used whole-genome DNA sequencing, pulsed-field gel electrophoresis, and comparative genome hybridization to determine mutation rates in diploid vegetative and meiotic mutation accumulation lines of Saccharomyces cerevisiae. The vegetative lines underwent only mitotic divisions while the meiotic lines underwent a meiotic cycle every ∼20 vegetative divisions. Similar base substitution rates were estimated for both lines. Given our experimental design, these measures indicated that the meiotic mutation rate is within the range of being equal to zero to being 55-fold higher than the vegetative rate. Mutations detected in vegetative lines were all heterozygous while those in meiotic lines were homozygous. A quantitative analysis of intra-tetrad mating events in the meiotic lines showed that inter-spore mating is primarily responsible for rapidly fixing mutations to homozygosity as well as for removing mutations. We did not observe 1–2 nt insertion/deletion (indel) mutations in any of the sequenced lines and only one structural variant in a non-telomeric location was found. However, a large number of structural variations in subtelomeric sequences were seen in both vegetative and meiotic lines that did not affect viability. Our results indicate that the diploid yeast nuclear genome is remarkably stable during the vegetative and meiotic cell cycles and support the hypothesis that peripheral regions of chromosomes are more dynamic than gene-rich central sections where structural rearrangements could be deleterious. This work also provides an improved estimate for the mutational load carried by diploid organisms.

15.
The recent FDA approval of the MiSeqDx platform provides a unique opportunity to develop targeted next generation sequencing (NGS) panels for human disease, including cancer. We have developed a scalable, targeted panel-based assay termed UNCseq, which involves an NGS panel of over 200 cancer-associated genes and a standardized downstream bioinformatics pipeline for detection of single nucleotide variations (SNV) as well as small insertions and deletions (indel). In addition, we developed a novel algorithm, NGScopy, designed for samples with sparse sequencing coverage to detect large-scale copy number variations (CNV), similar to human SNP Array 6.0, as well as small-scale intragenic CNV. Overall, we applied this assay to 100 snap-frozen lung cancer specimens lacking same-patient germline DNA (07–0120 tissue cohort) and validated our results against Sanger sequencing, SNP Array, and our recently published integrated DNA-seq/RNA-seq assay, UNCqeR, where RNA-seq of same-patient tumor specimens confirmed SNV detected by DNA-seq, if RNA-seq coverage depth was adequate. In addition, we applied the UNCseq assay on an independent lung cancer tumor tissue collection with available same-patient germline DNA (11–1115 tissue cohort) and confirmed mutations using assays performed in a CLIA-certified laboratory. We conclude that UNCseq can identify SNV, indel, and CNV in tumor specimens lacking germline DNA in a cost-efficient fashion.

16.
Mitochondrial DNA (mtDNA) variants are widely used in evolutionary genetics as markers for population history and to estimate divergence times among taxa. Inferences of species history are generally based on phylogenetic comparisons, which assume that molecular evolution is clock-like. Between-species comparisons have also been used to estimate the mutation rate, using sites that are thought to evolve neutrally. We directly estimated the mtDNA mutation rate by scanning the mitochondrial genome of Drosophila melanogaster lines that had undergone approximately 200 generations of spontaneous mutation accumulation (MA). We detected a total of 28 point mutations and eight insertion-deletion (indel) mutations, yielding an estimate for the single-nucleotide mutation rate of 6.2 × 10⁻⁸ per site per fly generation. Most mutations were heteroplasmic within a line, and their frequency distribution suggests that the effective number of mitochondrial genomes transmitted per female per generation is about 30. We observed repeated occurrences of some indel mutations, suggesting that indel mutational hotspots are common. Among the point mutations, there is a large excess of G→A mutations on the major strand (the sense strand for the majority of mitochondrial genes). These mutations tend to occur at nonsynonymous sites of protein-coding genes, and they are expected to be deleterious, so do not become fixed between species. The overall mtDNA mutation rate per base pair per fly generation in Drosophila is estimated to be about 10× higher than the nuclear mutation rate, but the mitochondrial major strand G→A mutation rate is about 70× higher than the nuclear rate. Silent sites are substantially more strongly biased towards A and T than nonsynonymous sites, consistent with the extreme mutation bias towards A+T. Strand-asymmetric mutation bias, coupled with selection to maintain specific nonsynonymous bases, therefore provides an explanation for the extreme base composition of the mitochondrial genome of Drosophila.

17.
Nuclease-based gene editing tools are rapidly transforming capabilities for altering the genome of cells and organisms with great precision and in high throughput studies. A major limitation in the application of precise gene editing lies in the lack of sensitive and fast methods to detect and characterize the induced DNA changes. Precise gene editing induces double-stranded DNA breaks that are repaired by error-prone non-homologous end joining, leading to the introduction of insertions and deletions (indels) at the target site. These indels are often small, and difficult and laborious to detect by traditional methods. Here we present a method for fast, sensitive and simple indel detection that accurately defines indel sizes down to ±1 bp. The method, coined IDAA for Indel Detection by Amplicon Analysis, is based on tri-primer amplicon labelling and DNA capillary electrophoresis detection, and IDAA is amenable to high throughput analysis.
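The idea of sizing indels from amplicon lengths can be sketched as follows. This is an illustrative assumption about how fragment sizes might be turned into indel calls, not the IDAA software or its published algorithm: the function name `indel_size`, the wild-type length of 250 bp, and the peak sizes are all hypothetical.

```python
def indel_size(wild_type_bp: float, observed_bp: float) -> int:
    """Estimate indel size in bp from fragment lengths.

    Negative values indicate a deletion, positive values an insertion,
    and 0 an amplicon indistinguishable from wild type.
    """
    if wild_type_bp <= 0 or observed_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    # Fragment sizes from electrophoresis are fractional; round to whole bp.
    return round(observed_bp - wild_type_bp)


# Hypothetical wild-type amplicon length and observed peak sizes (bp).
wild_type = 250.0
peaks = [250.1, 249.0, 243.2, 251.9]

calls = [indel_size(wild_type, peak) for peak in peaks]
for peak, call in zip(peaks, calls):
    kind = "wild type" if call == 0 else ("deletion" if call < 0 else "insertion")
    print(f"peak {peak} bp -> {call:+d} bp ({kind})")
```

Rounding to whole base pairs mirrors the ±1 bp resolution stated in the abstract; a real analysis would also need size-standard calibration and peak-area quantification, which this sketch omits.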

18.
19.
Chan SH, Bao Y, Ciszak E, Laget S, Xu SY. Nucleic Acids Research, 2007, 35(18): 6238-6248
Creating endonucleases with novel sequence specificities provides more possibilities to manipulate DNA. We have created a chimeric endonuclease (CH-endonuclease) consisting of the DNA cleavage domain of BmrI restriction endonuclease and C.BclI, a controller protein of the BclI restriction-modification system. The purified chimeric endonuclease, BmrI198-C.BclI, cleaves DNA at specific sites in the vicinity of the recognition sequence of C.BclI. Double-strand (ds) breaks were observed at two sites: 8 bp upstream and 18 bp within the C-box sequence. Using DNA substrates with deletions of C-box sequence, we show that the chimeric endonuclease requires the 5′ half of the C box only for specific cleavage. A schematic model is proposed for the mode of protein–DNA binding and DNA cleavage. The present study demonstrates that the BmrI cleavage domain can be used to create combinatorial endonucleases that cleave DNA at specific sequences dictated by the DNA-binding partner. The resulting endonucleases will be useful in vitro and in vivo to create ds breaks at specific sites and generate deletions.

20.
Many questions regarding the initiation of replication and translation of the segmented, double-stranded RNA genome of infectious bursal disease virus (IBDV) remain to be solved. Computer analysis shows that the non-polyadenylated extreme 3′-untranslated regions (UTRs) of the coding strand of both genomic segments are able to fold into a single stem–loop structure. To assess the determinants for a functional 3′-UTR, we mutagenized the 3′-UTR stem–loop structure of the B-segment. Rescue of infectious virus from mutagenized cDNA plasmids was impaired in all cases. However, after one passage, the replication kinetics of these viruses were restored. Sequence analysis revealed that additional mutations had been acquired in most of the stem–loop structures, which compensated the introduced ones. A rescued virus with a modified stem–loop structure containing four nucleotide substitutions, but preserving its overall secondary structure, was phenotypically indistinguishable from wild-type virus, both in vitro (cell culture) and in vivo (chickens, natural host). Sequence analysis showed that the modified stem–loop structure of this virus was fully preserved after four serial passages. Apparently, it is the stem–loop structure and not the primary sequence that is the functional determinant in the 3′-UTRs of IBDV.
