Similar articles
 20 similar articles found
1.
2.
DNA methylation, an important type of epigenetic modification in humans, participates in crucial cellular processes, such as embryonic development, X-inactivation, genomic imprinting and chromosome stability. Several platforms have been developed to study genome-wide DNA methylation. Many investigators in the field have chosen the Illumina Infinium HumanMethylation microarray for its ability to reliably assess DNA methylation following sodium bisulfite conversion. Here, we analyzed methylation profiles of 489 adult males and 357 adult females generated by the Infinium HumanMethylation450 microarray. Among the autosomal CpG sites that displayed significant methylation differences between the two sexes, we observed a significant enrichment of cross-reactive probes co-hybridizing to the sex chromosomes with more than 94% sequence identity. This could lead investigators to mistakenly infer the existence of significant autosomal sex-associated methylation. Using sequence identity cutoffs derived from the sex methylation analysis, we concluded that 6% of the array probes can potentially generate spurious signals because of co-hybridization to alternate genomic sequences highly homologous to the intended targets. Additionally, we discovered probes targeting polymorphic CpGs that overlapped SNPs. The methylation levels detected by these probes simply reflect the underlying genetic polymorphisms but could be misinterpreted as true signals. The existence of probes that are cross-reactive or that target polymorphic CpGs in the Illumina HumanMethylation microarrays can confound data obtained from such microarrays. Therefore, investigators should exercise caution when significant biological associations are found using these array platforms. A list of all cross-reactive probes and polymorphic CpGs that we identified is annotated in this paper.

3.
Structural variants (SVs) represent an important genetic resource for both natural and artificial selection. Here we present a chromosome-scale reference genome for domestic yak (Bos grunniens) that has longer contigs and scaffolds (N50 44.72 and 114.39 Mb, respectively) than reported for any other ruminant genome. We further obtained long-read resequencing data for 6 wild and 23 domestic yaks and constructed a genetic SV map of 372,220 SVs that covers the geographic range of the yaks. The majority of the SVs contain repetitive sequences, and several are in or near genes. By comparing SVs in domestic and wild yaks, we identified genes that are predominantly related to the nervous system, behavior, immunity, and reproduction and may have been targeted by artificial selection during yak domestication. These findings provide new insights into the domestication of animals living at high altitude and highlight the importance of SVs in animal domestication.

4.
Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, the complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) remain largely uncharacterized. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using the Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signatures as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogenes through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal a high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7, which might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SV-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms have preventive, diagnostic, and therapeutic implications for ESCCs.

5.
The disease caused by the apicomplexan protozoan parasite Theileria parva, known as East Coast fever or Corridor disease, is one of the most serious cattle diseases in Eastern, Central, and Southern Africa. We performed whole-genome sequencing of nine T. parva strains, including one of the vaccine strains (Kiambu 5), field isolates from Zambia, Uganda, Tanzania, or Rwanda, and two buffalo-derived strains. Comparison with the reference Muguga genome sequence revealed 34 814–121 545 single nucleotide polymorphisms (SNPs) that were more abundant in buffalo-derived strains. High-resolution phylogenetic trees were constructed with selected informative SNPs that allowed the investigation of possible complex recombination events among ancestors of the extant strains. We further analysed the dN/dS ratio (non-synonymous substitutions per non-synonymous site divided by synonymous substitutions per synonymous site) for 4011 coding genes to estimate potential selective pressure. Genes under possible positive selection were identified that may, in turn, assist in the identification of immunogenic proteins or vaccine candidates. This study elucidated the phylogeny of T. parva strains based on genome-wide SNP analysis with prediction of possible past recombination events, providing insight into the migration, diversification, and evolution of this parasite species on the African continent.

6.
Daniel Gianola 《Genetics》2013,194(3):573-596
Whole-genome enabled prediction of complex traits has received enormous attention in animal and plant breeding and is making inroads into human and even Drosophila genetics. The term “Bayesian alphabet” denotes a growing number of letters of the alphabet used to denote various Bayesian linear regressions that differ in the priors adopted, while sharing the same sampling model. We explore the role of the prior distribution in whole-genome regression models for dissecting complex traits in what is now a standard situation with genomic data where the number of unknown parameters (p) typically exceeds sample size (n). Members of the alphabet aim to confront this overparameterization in various manners, but it is shown here that the prior is always influential, unless n ≫ p. This happens because parameters are not likelihood identified, so Bayesian learning is imperfect. Since inferences are not devoid of the influence of the prior, claims about genetic architecture from these methods should be taken with caution. However, all such procedures may deliver reasonable predictions of complex traits, provided that some parameters (“tuning knobs”) are assessed via a properly conducted cross-validation. It is concluded that members of the alphabet have a place in whole-genome prediction of phenotypes, but have somewhat doubtful inferential value, at least when sample size is such that n ≪ p.

7.
8.
Illumina's Genome Analyzer generates ultra-short sequence reads, typically 36 nucleotides in length, and is primarily intended for resequencing. We tested the potential of this technology for de novo sequence assembly on the 6 Mbp genome of Pseudomonas syringae pv. syringae B728a with several freely available assembly software packages. Using an unpaired data set, velvet assembled >96% of the genome into contigs with an N50 length of 8289 nucleotides and an error rate of 0.33%. edena generated smaller contigs (N50 was 4192 nucleotides) and comparable error rates. ssake and vcake yielded shorter contigs with very high error rates. Assembly of paired-end sequence data carrying 400 bp inserts produced longer contigs (N50 up to 15 628 nucleotides), but with increased error rates (0.5%). Contig length and error rate were very sensitive to the choice of parameter values. Noncoding RNA genes were poorly resolved in de novo assemblies, while >90% of the protein-coding genes were assembled with 100% accuracy over their full length. This study demonstrates that, in practice, de novo assembly of 36-nucleotide reads can generate reasonably accurate assemblies from sequence data sets of about 40× depth. These draft assemblies are useful for exploring an organism's proteomic potential at very low cost.
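
The abstract above compares assemblers by contig N50. As a reading aid, here is a minimal sketch of the usual N50 definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembled length. The function name `n50` and the sample lengths are illustrative assumptions, not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Hypothetical helper for illustration: N50 is the length of the contig
    at which the cumulative sum, taken from longest to shortest, first
    reaches at least half of the total assembled length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if 2 * running >= total:
            return length


# Illustrative lengths only (not data from the abstract).
example_lengths = [2, 3, 4, 5, 6]
print(n50(example_lengths))
```

Because N50 depends only on the length distribution, it says nothing about correctness, which is why the abstract reports error rates alongside it.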

9.
Multi-sample pooling and Illumina Genome Analyzer (GA) sequencing allow high-throughput sequencing of multiple samples to determine population sequence variation. A preliminary experiment, using the RET proto-oncogene as a model, predicted that ≤30 samples could be pooled to reliably detect singleton variants without requiring additional confirmation testing. This report used 30- and 50-sample pools to test the hypothesized pooling limit and also to test recent protocol improvements, Illumina GAIIx upgrades, and longer read chemistry. The SequalPrep™ method was used to normalize amplicons before pooling. For comparison, a single ‘control’ sample was run in a different flow cell lane. Data were evaluated by variant read percentages and by the subtractive correction method, which utilizes the control sample. In total, 59 variants were detected within the pooled samples, which included all 47 known true variants. The 15 singleton variants known from Sanger sequencing had an average of 1.62±0.26% variant reads for the 30-sample pool (expected 1.67% for a singleton variant [unique variant within the pool]) and 1.01±0.19% for the 50-sample pool (expected 1%). The 76-base reads had higher error rates than the shorter reads (33 and 50 bases), which eliminated the distinction between true singleton variants and background error. This report demonstrated pooling limits from 30 up to 50 samples (depending on error rates and coverage) for reliable singleton variant detection. The presented pooling protocols and analysis methods can be used for variant discovery in other genes, facilitating molecular diagnostic test design and interpretation.
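
The expected singleton percentages quoted above (1.67% for 30 samples, 1% for 50) are consistent with one variant allele among all alleles in the pool. A minimal sketch of that arithmetic follows, assuming diploid samples, a heterozygous singleton, and equal contribution from every sample after normalization; the function name and the `ploidy` parameter are hypothetical.

```python
def expected_singleton_fraction(n_samples, ploidy=2):
    """Expected fraction of variant reads for one heterozygous singleton.

    Assumptions (not stated as code in the source): every sample
    contributes equally to the pool, and the variant is present on exactly
    one of the ploidy * n_samples chromosome copies.
    """
    if n_samples < 1 or ploidy < 1:
        raise ValueError("n_samples and ploidy must be positive")
    return 1.0 / (ploidy * n_samples)


for pool_size in (30, 50):
    percent = 100 * expected_singleton_fraction(pool_size)
    print(f"pool of {pool_size}: {percent:.2f}% expected variant reads")
```

Under these assumptions the expected signal shrinks as the pool grows, so the per-base error rate sets a practical ceiling on pool size, which matches the abstract's observation about the 76-base reads.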

10.
We describe a method for linear isothermal DNA amplification using nicking endonuclease-mediated strand displacement by a DNA polymerase. The nicking of one strand of a DNA target by the endonuclease produces a primer for the polymerase to initiate synthesis. As the polymerization proceeds, the downstream strand is displaced into a single-stranded form while the nicking site is also regenerated. The combined continuous repetitive action of nicking by the endonuclease and strand-displacement synthesis by the polymerase results in linear amplification of one strand of the DNA molecule. We demonstrate that DNA templates up to 5000 nucleotides can be linearly amplified using a nicking endonuclease with a 7-bp recognition sequence and Sequenase version 2.0 in the presence of single-stranded DNA binding proteins. We also show that a mixture of three templates of 500, 1000, and 5000 nucleotides in length is linearly amplified with the original molar ratios of the templates preserved. Moreover, we demonstrate that a complex library of hydrodynamically sheared genomic DNA from bacteriophage lambda can be amplified linearly.

11.
Paramecium has long been a model eukaryote. The sequence of the Paramecium tetraurelia genome reveals a history of three successive whole-genome duplications (WGDs), and the sequences of P. biaurelia and P. sexaurelia suggest that these WGDs are shared by all members of the aurelia species complex. Here, we present the genome sequence of P. caudatum, a species closely related to the P. aurelia species group. P. caudatum shares only the most ancient of the three WGDs with the aurelia complex. We found that P. caudatum maintains twice as many paralogs from this early event as the P. aurelia species, suggesting that post-WGD gene retention is influenced by subsequent WGDs and supporting the importance of selection for dosage in gene retention. The availability of P. caudatum as an outgroup allows an expanded analysis of the aurelia intermediate and recent WGD events. Both the Guanine+Cytosine (GC) content and the expression level of preduplication genes are significant predictors of duplicate retention. We find widespread asymmetrical evolution among aurelia paralogs, which is likely caused by gradual pseudogenization rather than by neofunctionalization. Finally, cases of divergent resolution of intermediate WGD duplicates between aurelia species suggest that this process acts as an ongoing reinforcement mechanism of reproductive isolation long after a WGD event.

12.
The proper identification of differentially methylated CpGs is central in most epigenetic studies. The Illumina HumanMethylation450 BeadChip is widely used to quantify DNA methylation; nevertheless, the design of an appropriate analysis pipeline faces severe challenges due to the convolution of biological and technical variability and the presence of a signal bias between Infinium I and II probe design types. Despite recent attempts to investigate how to analyze DNA methylation data with such an array design, it has not been possible to perform a comprehensive comparison between different bioinformatics pipelines due to the lack of appropriate data sets having both a large sample size and a sufficient number of technical replicates. Here we perform such a comparative analysis, targeting the problems of reducing the technical variability, eliminating the probe design bias and reducing the batch effect by exploiting two unpublished data sets, which included technical replicates and were profiled for DNA methylation on peripheral blood, monocytes or muscle biopsies. We evaluated the performance of different analysis pipelines and demonstrated that: (1) it is critical to correct for the probe design type, since the amplitude of the measured methylation change depends on the underlying chemistry; (2) the effect of different normalization schemes is mixed, and the most effective methods in our hands were quantile normalization and Beta Mixture Quantile dilation (BMIQ); (3) it is beneficial to correct for batch effects. In conclusion, our comparative analysis using a comprehensive data set suggests an efficient pipeline for proper identification of differentially methylated CpGs using the Illumina 450K arrays.

13.
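DNA extraction from dried leaves of Ophiopogon xylorrhizus for RAPD analysis   Cited by: 3 (self-citations: 0; citations by others: 3)
Ophiopogon xylorrhizus is a rare and endangered plant in China whose distribution is restricted to the rainforest of Xishuangbanna, Yunnan, and it is of particular significance for research in plant systematics and conservation biology. Random amplified polymorphic DNA (RAPD) analysis is an efficient and simple method for revealing genetic diversity in populations, but total DNA is generally extracted from fresh material, which makes the method difficult to apply to species growing in remote areas. In this study, total DNA was extracted from dried leaves of O. xylorrhizus and used for RAPD analysis. Samples were taken from 49 individuals in 4 populations. Vigorously growing leaves were selected and quickly dried and preserved with silica gel in the field. Total DNA was extracted by a high-salt, low-pH method, yielding 80 to 160 μg of DNA from the dried leaves obtained per gram of fresh weight. By applying various treatments to the template DNA and adjusting the PCR amplification program, problems such as diffuse band edges, blurred boundaries, and low yield were resolved, and satisfactory amplification patterns were obtained. These results provide guidance for other RAPD studies that use DNA extracted from dried leaves sampled directly in the field.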

14.
张双华  孙源  张治洲 《生命科学》2013,(11):1135-1143
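Genome synthesis is an important part of synthetic biology, and its development will strongly promote future progress in biology, medicine, agriculture, energy, and other fields. DNA assembly, a key technology required for genome synthesis, is also maturing steadily. This paper reviews a range of DNA assembly techniques, surveys the several methods with the greatest development potential, and discusses the principles and characteristics of the different techniques.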

15.
黄淑帧  王启松 《遗传学报》1989,16(6):475-482
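This paper reports the use of DNA amplification for gene diagnosis of the first patient with sickle cell trait (Hb S heterozygote) identified in China. Genomic DNA was extracted in trace amounts from the patient's dried blood specimen, the β-globin gene was amplified by polymerase chain reaction (PCR), and the product was digested with the restriction endonuclease MstII and analyzed by electrophoresis to detect the Hb S gene directly. The DNA diagnostic technique described here is rapid, sensitive, and simple; it requires no radioisotope-labeled probe and can use DNA extracted from dried blood. It is therefore of considerable value for gene diagnosis of genetic diseases and for carrier screening.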

16.
A DNA-binding matrix was immobilized on the surface of a 96-well microplate and used for plasmid DNA preparation for DNA sequencing. The same DNA-binding plate was used for bacterial growth, cell lysis, DNA purification, and storage. In a single step using one buffer, bacterial cells were lysed by enzymes, and released DNA was captured on the plate simultaneously. After two wash steps, DNA was eluted and stored in the same plate. Inclusion of phosphates in the culture medium was found to enhance the yield of plasmid significantly. Purified DNA samples were used successfully in DNA sequencing with high consistency and reproducibility. Eleven vectors and nine libraries were tested using this method. In 10 μl sequencing reactions using 3 μl of sample and 0.25 μl of BigDye Terminator v3.1, the results from a 3730xl sequencer gave a success rate of 90–95% and read lengths of 700 bases or more. The method is fully automatable and convenient for manual operation as well. It enables reproducible, high-throughput, rapid production of DNA with purity and yields sufficient for high-quality DNA sequencing at a substantially reduced cost.

17.
18.
19.
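Rapid screening of SNPs and estimation of allele frequencies using DNA pools and sequencing   Cited by: 8 (self-citations: 0; citations by others: 8)
Breed-specific DNA pools were constructed for four chicken breeds with marked differences in egg-laying performance (Leghorn, Yangshan, Silkie, and White Recessive Rock), and sequencing was used to study polymorphism in the distal sequence (1028 bp) of the 5′ flanking regulatory region of the chicken prolactin gene. Eight SNPs possibly associated with egg-laying performance were rapidly identified (C-2402T, T-2192C, C-2161G, C-2134G, C-2062G, G-2040A, A-1944G, and C-1884A). Allele frequencies in each breed were then estimated from the ratios of SNP allele peak heights in the sequencing chromatograms. The estimates for C-2402T, C-2161G, and C-1884A were verified by PCR-RFLP, and those for C-2062G and G-2040A by PCR-SSCP, indicating that estimating allele frequencies from sequencing peak-height ratios is feasible to a certain extent.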
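
The peak-height approach described above can be sketched as a simple proportion. This is a simplified illustration, not the authors' exact procedure: it assumes that the two allele peaks at a SNP position in a pooled-DNA chromatogram scale equally with allele abundance, which real traces only approximate because peak height also depends on the dye and sequence context. The function name and the peak values are hypothetical.

```python
def allele_frequency_from_peaks(peak_a, peak_b):
    """Estimate the frequency of allele A from two chromatogram peak heights.

    Simplifying assumption for illustration: peak height is proportional
    to allele abundance in the DNA pool, with the same proportionality
    constant for both alleles.
    """
    if peak_a < 0 or peak_b < 0:
        raise ValueError("peak heights must be non-negative")
    total = peak_a + peak_b
    if total == 0:
        raise ValueError("at least one peak must be non-zero")
    return peak_a / total


# Hypothetical peak heights in arbitrary fluorescence units.
print(allele_frequency_from_peaks(300, 100))
```

An estimate of this kind is best treated as provisional until checked by genotyping individuals, which is the role PCR-RFLP and PCR-SSCP play in the abstract.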

20.
Breakage-fusion-bridge cycles contribute to chromosome aberrations and generate large DNA palindromes that facilitate oncogene amplification in cancer cells. At the molecular level, large DNA palindrome formation is initiated by chromosome breaks, and genomic architecture such as short inverted repeat sequences facilitates this process in mammalian cells. However, the prevalence of DNA palindromes in cancer cells is currently unknown. To determine the prevalence of DNA palindromes in human cancer cells, we have developed a new microarray-based approach called Genome-wide Analysis of Palindrome Formation (GAPF; Tanaka et al., Nat Genet 2005; 37: 320-7). This approach is based on a relatively simple and efficient method to purify "snap-back DNA" from large DNA palindromes by intramolecular base-pairing, followed by elimination of single-stranded DNA by nuclease S1. Comparison of GAPF profiles between cancer and normal cells using microarrays can identify genome-wide distributions of somatic palindromes. Using a human cDNA microarray, we have shown that DNA palindromes occur frequently in human cancer cell lines and primary medulloblastomas. Significant overlap of the loci containing DNA palindromes between the Colo320DM and MCF7 cancer cell lines suggests that there are regions in the genome susceptible to chromosome breaks and palindrome formation. A subset of loci containing palindromes is associated with gene amplification in Colo320DM, indicating that the location of palindromes in the cancer genome serves as a structural platform that supports subsequent gene amplification.
