Similar articles
 Found 20 similar articles; search time: 9 ms
1.
2.
Resolution of the two haplotypes present in an individual that is heterozygous at a locus has been a difficult problem for nucleotide sequence-based population genetic studies. Here, we demonstrate a method in which allele-specific polymerase chain reaction (AS-PCR) and computational phasing are combined for relatively high-throughput, efficient resolution of phase in resequencing studies. Using data from multiple loci that were fully experimentally phased, we demonstrate that the popular computational tool PHASE can accurately phase heterozygous individuals with common SNPs (single nucleotide polymorphisms) and/or common haplotypes. However, we also demonstrate that experimental phasing with AS-PCR can efficiently supplement computational phasing, providing a rapid means to phase individuals with rare SNPs or haplotypes and with heterozygous insertion/deletion polymorphisms. By following simple stepwise procedures, AS-PCR can result in much more efficient and accurate experimental phasing of haplotypes than is possible with traditional methods such as cloning.

3.
4.
Cassava (Manihot esculenta Crantz, 2n = 36) is a global food security crop. It has a highly heterozygous genome, high genetic load, and genotype-dependent asynchronous flowering. It is typically propagated by stem cuttings and any genetic variation between haplotypes, including large structural variations, is preserved by such clonal propagation. Traditional genome assembly approaches generate a collapsed haplotype representation of the genome. In highly heterozygous plants, this results in artifacts and an oversimplification of heterozygous regions. We used a combination of Pacific Biosciences (PacBio), Illumina, and Hi-C to resolve each haplotype of the genome of a farmer-preferred cassava line, TME7 (Oko-iyawo). PacBio reads were assembled using the FALCON suite. Phase switch errors were corrected using FALCON-Phase and Hi-C read data. The ultralong-range information from Hi-C sequencing was also used for scaffolding. Comparison of the two phases revealed >5000 large haplotype-specific structural variants affecting over 8 Mb, including insertions and deletions spanning thousands of base pairs. The potential of these variants to affect allele-specific expression was further explored. RNA-sequencing data from 11 different tissue types were mapped against the scaffolded haploid assembly and gene expression data are incorporated into our existing easy-to-use web-based interface to facilitate use by the broader plant science community. These two assemblies provide an excellent means to study the effects of heterozygosity, haplotype-specific structural variation, gene hemizygosity, and allele-specific gene expression contributing to important agricultural traits and further our understanding of the genetics and domestication of cassava.

5.
Asparagus kiusianus is a disease-resistant dioecious plant species and a wild relative of garden asparagus (Asparagus officinalis). To enhance A. kiusianus genomic resources, advance plant science, and facilitate asparagus breeding, we determined the genome sequences of the male and female lines of A. kiusianus. Genome sequence reads obtained with a linked-read technology were assembled into four haplotype-phased contig sequences (∼1.6 Gb each) for the male and female lines. The contig sequences were aligned onto the chromosome sequences of garden asparagus to construct pseudomolecule sequences. Approximately 55,000 potential protein-encoding genes were predicted in each genome assembly, and ∼70% of the genome sequence was annotated as repetitive. Comparative analysis of the genomes of the two species revealed structural and sequence variants between the two species as well as between the male and female lines of each species. Genes with high sequence similarity to the male-specific sex determinant gene in A. officinalis, MSE1/AoMYB35/AspTDF1, were present in the genomes of the male line but absent from the female genome assemblies. Overall, the genome sequence assemblies, gene sequences, and structural and sequence variants determined in this study will reveal the genetic mechanisms underlying sexual differentiation in plants, and will accelerate disease-resistance breeding in garden asparagus.

6.
Artificial insemination in the C3HeB/FeJ inbred strain of mice has been shown to be more successful at the middle and end of the calendar year. The reasons are twofold: 1) an increase in the number of normal estrous cycles exhibited by females and 2) an increase in the tightness of the phasing of ovarian and vaginal events. The latter phenomenon was found to be the key to the success of artificial insemination, since it permitted the use of vaginal smears to predict accurately the time females could be expected to ovulate and, therefore, the appropriate time for artificial insemination. Seasonal variations in the frequency of estrous cycling also have been observed in SJL/J and B6D2F1/J females.

7.
While standard DNA‐sequencing approaches readily yield genotypic sequence data, haplotype information is often of greater utility for population genetic analyses. However, obtaining individual haplotype sequences can be costly and time‐consuming and sometimes requires statistical reconstruction approaches that are subject to bias and error. Advancements have recently been made in determining individual chromosomal sequences in large‐scale genomic studies, yet few options exist for obtaining this information from large numbers of highly polymorphic individuals in a cost‐effective manner. As a solution, we developed a simple PCR‐based method for obtaining sequence information from individual DNA strands using standard laboratory equipment. The method employs a water‐in‐oil emulsion to separate the PCR mixture into thousands of individual microreactors. PCR within these small vesicles results in amplification from only a single starting DNA template molecule and thus a single haplotype. We improved upon previous approaches by including SYBR Green I and a melted agarose solution in the PCR, allowing easy identification and separation of individually amplified DNA molecules. We demonstrate the use of this method on a highly polymorphic estuarine population of the copepod Eurytemora affinis for which current molecular and computational methods for haplotype determination have been inadequate.

8.
For half a century population genetics studies have put type II restriction endonucleases to work. Now, coupled with massively‐parallel, short‐read sequencing, the family of RAD protocols that wields these enzymes has generated vast genetic knowledge from the natural world. Here, we describe the first software natively capable of using paired‐end sequencing to derive short contigs from de novo RAD data. Stacks version 2 employs a de Bruijn graph assembler to build and connect contigs from forward and reverse reads for each de novo RAD locus, which it then uses as a reference for read alignments. The new architecture allows all the individuals in a metapopulation to be considered at the same time as each RAD locus is processed. This enables a Bayesian genotype caller to provide precise SNPs, and a robust algorithm to phase those SNPs into long haplotypes, generating RAD loci that are 400–800 bp in length. To prove its recall and precision, we tested the software with simulated data and compared reference‐aligned and de novo analyses of three empirical data sets. Our study shows that the latest version of Stacks is highly accurate and outperforms other software in assembling and genotyping paired‐end de novo data sets.

9.
Short read sequencing of diploid individuals does not permit the direct inference of the sequence on each of the two homologous chromosomes. Although various phasing software packages exist, they were primarily tailored for and tested on human data, which differ from other species in factors that influence phasing, such as SNP density, amounts of linkage disequilibrium (LD) and sample sizes. Despite becoming increasingly popular for other species, the reliability of phasing in non‐human data has not been evaluated to a sufficient extent. We scrutinized the phasing accuracy for Drosophila melanogaster, a species with high polymorphism levels and reduced LD relative to humans. We phased two D. melanogaster populations and compared the results to the known haplotypes. The performance increased with size of the reference panel and was highest when the reference panel and phased individuals were from the same population. Full genomic SNP data and inclusion of sequence read information also improved phasing. Despite humans and Drosophila having similar switch error rates between polymorphic sites, the distances between switch errors were much shorter in Drosophila with only fragments <300–1500 bp being correctly phased with ≥95% confidence. This suggests that the higher SNP density cannot compensate for the higher recombination rate in D. melanogaster. Furthermore, we show that populations that have gone through demographic events such as bottlenecks can be phased with higher accuracy. Our results highlight that statistically phased data are particularly error prone in species with large population sizes or populations lacking suitable reference panels.
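The switch error rate mentioned in this abstract can be illustrated with a short sketch. It assumes a simplified setting that the abstract does not specify: one individual, alleles coded 0/1 at heterozygous sites only, and a single inferred haplotype compared with the known one. The function name `switch_error_rate` and the example data are hypothetical.

```python
from typing import Sequence


def switch_error_rate(true_hap: Sequence[int], inferred_hap: Sequence[int]) -> float:
    """Fraction of adjacent heterozygous-site pairs whose relative phase is wrong.

    Both arguments list the allele (0 or 1) carried by ONE haplotype at each
    heterozygous site, in genomic order (an assumed, simplified encoding).
    """
    if len(true_hap) != len(inferred_hap):
        raise ValueError("haplotypes must cover the same heterozygous sites")
    if len(true_hap) < 2:
        return 0.0
    # At each site, note whether the inferred allele agrees with the truth.
    matches = [t == i for t, i in zip(true_hap, inferred_hap)]
    # A switch error is a flip of that agreement state between neighbouring sites.
    switches = sum(1 for a, b in zip(matches, matches[1:]) if a != b)
    return switches / (len(matches) - 1)


truth = [0, 1, 1, 0, 1, 0]
inferred = [0, 1, 0, 1, 0, 0]
print(switch_error_rate(truth, inferred))
```

Because only changes in agreement are counted, an inferred haplotype that is the exact complement of the truth scores zero switch errors, which matches the usual convention that the labelling of the two haplotypes is arbitrary.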

10.
11.
L-proline is an amino acid that plays an important role in proteins, uniquely contributing to protein folding, structure, and stability, and this amino acid serves as a sequence-recognition motif. Proline biosynthesis can occur via two pathways, one from glutamate and the other from arginine. In both pathways, the last step of biosynthesis, the conversion of delta1-pyrroline-5-carboxylate (P5C) to L-proline, is catalyzed by delta1-pyrroline-5-carboxylate reductase (P5CR) using NAD(P)H as a cofactor. We have determined the first crystal structures of P5CR from two human pathogens, Neisseria meningitidis and Streptococcus pyogenes, at 2.0 angstroms and 2.15 angstroms resolution, respectively. The catalytic unit of P5CR is a dimer composed of two domains, but the biological unit seems to be species-specific. The N-terminal domain of P5CR is an alpha/beta/alpha sandwich, a Rossmann fold. The C-terminal dimerization domain is rich in alpha-helices and shows domain swapping. Comparison of the native structure of P5CR to structures complexed with L-proline and NADP+ in two quite different primary sequence backgrounds provides unique information about key functional features: the active site and the catalytic mechanism. The inhibitory L-proline has been observed in the crystal structure.

12.
Various populations have contributed to the present-day gene pool in oriental Mediterranean (Aegean Sea) and are well documented for ancient history. The primary objective of the study is to report on the analysis of the paternal component of the variation (Y chromosome haplotypes) in contemporary populations in Greece, Crete, Turkey and Cyprus. A total of 245 males who hailed from five different locations in Turkey, Greece, and the islands of Crete and Cyprus were analyzed for Y-chromosome-specific haplotypes based on p49a,f TaqI polymorphism. The main haplotype observed (21.2%) in the Greek–Turkish area is haplotype VII. The second haplotype in terms of frequency (13.5%) is haplotype VIII, which is characteristic of Semitic populations. The third (11.4%), fourth (6.9%) and fifth (5.7%) haplotypes in frequency are haplotype XI (a typical eastern European haplotype), haplotype V (the North African haplotype) and haplotype XV (the Western European haplotype), respectively. The distribution of haplotype VII is significantly heterogeneous genetically among the five localities studied, with a peak of frequency (43.8%) in Crete. It is proposed that haplotype VII reflects the ancient Minoan civilization. Haplotype VII frequencies actually known are mapped in countries surrounding the Mediterranean Sea.

13.
Analysis of Y-DNA and mitochondrial DNA sequence polymorphisms in the Dong people of Congjiang, Guizhou   Total citations: 3, self-citations: 3, citations by others: 3
To analyze the paternal and maternal genetic structure of the Dong people of Congjiang, Guizhou, and to explore their origin and migration, polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) was used to study, in unrelated individuals, the frequencies of Y-chromosome haplotypes defined by 10 single nucleotide polymorphisms (SNPs) and of mitochondrial DNA haplogroups defined by 11 SNPs. Y-SNP genotyping of 40 male samples yielded three haplotypes, H6, H11, and H14, with H11 at a frequency of 92.5%. Mitochondrial DNA genotyping yielded six haplogroups, and 75% of individuals could be unambiguously assigned to a haplogroup. The results indicate that the paternal genetic composition of the Congjiang Dong is relatively simple. Principal component analysis showed that the Congjiang Dong cluster with other populations of the Zhuang-Dong language group. The maternal genetic structure is complex, and the absence of haplogroup C may be one of the characteristics of this ethnic group.

14.
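Research progress in haplotype analysis techniques   Total citations: 1, self-citations: 0, citations by others: 1
A haplotype is the combination of a series of genetic variant sites that co-occur on a single chromosome; each chromosome has its own unique haplotype. Haplotype analysis, a commonly used data analysis approach, is an effective way to identify heterozygous SNP variants on a single chromosome, and it also plays an important role in discovering disease-causing genes and finding new disease treatments. It mainly comprises indirect inference methods and direct experimental methods. This review introduces the various haplotype analysis methods and their applications, describing in particular detail the single-molecule dilution method and contiguity-preserving transposition sequencing, and discusses the prospects for applying haplotype analysis techniques.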
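To make the phase ambiguity behind indirect (inference-based) haplotype analysis concrete, the following minimal sketch enumerates every haplotype pair that is consistent with one unphased diploid genotype. The 0/1/2 genotype coding (count of the alternate allele per site) and the function name `compatible_haplotype_pairs` are assumptions made for illustration; the abstract does not describe any specific algorithm.

```python
from itertools import product
from typing import List, Sequence, Tuple

Haplotype = Tuple[int, ...]


def compatible_haplotype_pairs(genotype: Sequence[int]) -> List[Tuple[Haplotype, Haplotype]]:
    """List the unordered haplotype pairs that could produce an unphased genotype.

    genotype[i] is the number of alternate alleles at site i (0, 1 or 2).
    """
    if any(g not in (0, 1, 2) for g in genotype):
        raise ValueError("genotype values must be 0, 1 or 2")
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    # Homozygous sites are fixed: 0 -> allele 0 on both copies, 2 -> allele 1.
    base = [g // 2 for g in genotype]
    pairs = set()
    # Try every assignment of alleles to the first haplotype at heterozygous sites;
    # the second haplotype must carry the opposite allele at each of them.
    for alleles in product((0, 1), repeat=len(het_sites)):
        h1, h2 = base[:], base[:]
        for site, allele in zip(het_sites, alleles):
            h1[site], h2[site] = allele, 1 - allele
        # Sort the two haplotypes so that (a, b) and (b, a) count only once.
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


for pair in compatible_haplotype_pairs([1, 2, 1]):
    print(pair)
```

With k heterozygous sites there are 2^(k-1) distinct pairs for k ≥ 1, which is why statistical inference or direct experimental phasing is needed once more than one site is heterozygous.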

15.
In this report we highlight the latest trends in phasing methods used to solve alpha helical membrane protein structures and analyze the use of heavy atoms for the purpose of experimental phasing. Our results reveal that molecular replacement is emerging as the most successful method for phasing alpha helical membrane proteins, with the notable exception of the transporter family, where experimentally derived phase information remains the most effective method. To facilitate selection of heavy atom salts for experimental phasing, an analysis of these was undertaken; it indicates that organic mercury salts are still the most successful heavy atom reagents. Interestingly, the use of seleno‐L‐methionine incorporated protein has increased since earlier studies of membrane protein phasing, as has the use of SAD and MAD as techniques for phase determination. Taken together, this study provides a brief snapshot of phasing methods for alpha helical membrane proteins and suggests possible routes for heavy atom selection and phasing methods based on currently available data.

16.

17.
Conservation and Periodicity of DNA Bend Sites in Eukaryotic Genomes   Total citations: 2, self-citations: 0, citations by others: 2
DNA bend sites appear every 680 bp on average in the human ε- and β-globin gene regions. Although most of their molecular nature has not been unraveled, a potential bend core sequence A2N8A2N8A2 (A/A/A) and its complementary T2N8T2N8T2 (T/T/T) appeared preferentially either in or very close to most of the bend sites, whereas other combinations of A2 and T2 dinucleotides, A/T/T + A/A/T, T/T/A + T/A/A and A/T/A + T/A/T, did not. The distances between any two of the core sequences in the entire β-globin locus showed a strong bias to a length of 701–800 bp and multiples thereof, suggesting that there is periodicity throughout the locus. This bias was not found for other combinations of A2 and T2. Again, this periodicity was identified in many eukaryotic genes, whereas the tendency was absent in mRNAs and prokaryotic as well as viral genomes.
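As a rough illustration of the core-sequence analysis described in this abstract, the sketch below locates A2N8A2N8A2 and T2N8T2N8T2 motifs with regular expressions and lists the distances between every pair of hits. It is not the authors' procedure: the function names `core_positions` and `pairwise_distances` and the toy sequence are hypothetical, and N is assumed to mean any of A, C, G, or T.

```python
import re
from typing import List

# A2N8A2N8A2 and its complement as regular expressions. The zero-width
# lookahead lets finditer report overlapping occurrences as well.
CORE_A = re.compile(r"(?=(AA[ACGT]{8}AA[ACGT]{8}AA))")
CORE_T = re.compile(r"(?=(TT[ACGT]{8}TT[ACGT]{8}TT))")


def core_positions(seq: str) -> List[int]:
    """Return sorted 0-based start positions of both core motifs in seq."""
    seq = seq.upper()
    hits = [m.start() for m in CORE_A.finditer(seq)]
    hits += [m.start() for m in CORE_T.finditer(seq)]
    return sorted(hits)


def pairwise_distances(positions: List[int]) -> List[int]:
    """Distances between every pair of motif start positions."""
    return [b - a for i, a in enumerate(positions) for b in positions[i + 1:]]


# Toy sequence: one A/A/A core, a 50 bp spacer, then one T/T/T core.
toy = "AA" + "C" * 8 + "AA" + "G" * 8 + "AA" + "C" * 50 + "TT" + "G" * 8 + "TT" + "C" * 8 + "TT"
positions = core_positions(toy)
print(positions, pairwise_distances(positions))
```

On a real locus, binning the pairwise distances (for example in 100 bp windows) would be one way to look for the bias towards 701–800 bp and its multiples that the abstract reports.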

18.
Hanli Xu, Yongtao Guan. Genetics, 2014, 197(3): 823-838
A novel haplotype association method is presented, and its power is demonstrated. Relying on a statistical model for linkage disequilibrium (LD), the method first infers ancestral haplotypes and their loadings at each marker for each individual. The loadings are then used to quantify local haplotype sharing between individuals at each marker. A statistical model was developed to link the local haplotype sharing and phenotypes to test for association. We devised a novel method to fit the LD model, reducing the complexity from putatively quadratic to linear (in the number of ancestral haplotypes). Therefore, the LD model can be fitted to all study samples simultaneously, and, consequently, our method is applicable to big data sets. Compared to existing haplotype association methods, our method integrated out phase uncertainty, avoided arbitrariness in specifying haplotypes, and had the same number of tests as the single-SNP analysis. We applied our method to data from the Wellcome Trust Case Control Consortium and discovered eight novel associations between seven gene regions and five disease phenotypes. Among these, GRIK4, which encodes a protein that belongs to the glutamate-gated ionic channel family, is strongly associated with both coronary artery disease and rheumatoid arthritis. A software package implementing methods described in this article is freely available at http://www.haplotype.org.

19.
The presence of heterozygous indels in a DNA sequence usually results in the sequence being discarded. If the sequence trace is of high enough quality, however, it will contain enough information to reconstruct the two constituent sequences with very little ambiguity. Solutions already exist using comparisons with a known reference sequence, but this is often unavailable for nonmodel organisms or novel DNA regions. I present a program which determines the sizes and positions of heterozygous indels in a DNA sequence and reconstructs the two constituent haploid sequences. No external data such as a reference sequence or other prior knowledge are required. Simulation suggests an accuracy of >99% from a single read, with errors being eliminable by the inclusion of a second sequencing read, such as one using a reverse primer. Diploid sequences can be fully reconstructed across any number of heterozygous indels, with two overlapping sequencing reads almost always sufficient to infer the entire DNA sequence. This eliminates the need for costly and laborious cloning, and allows data to be used which would otherwise be discarded. With no more laboratory work than is needed to produce two normal sequencing reads, two aligned haploid sequences can be produced quickly and accurately and with extensive phasing information.

20.
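Analysis and application of single nucleotide polymorphisms and haplotypes in the human genome   Total citations: 9, self-citations: 0, citations by others: 9
Single nucleotide polymorphisms (SNPs) are the most abundant genetic variation in the human genome. A haplotype is a set of associated SNP alleles located on one chromosome or within a given region, and haplotypes have become an integral part of human genetic research in recent years. The goal of the human genome haplotype map (HapMap) project is to establish the common patterns of polymorphic sites in human DNA sequence and to identify tag SNPs that represent the set of SNPs across the whole human genome map. In studies of complex diseases, analysis of haplotypes composed of multiple variant sites is superior to analysis of single SNPs. This article discusses the definitions of SNPs, genotype, and phenotype and gives some background on the HapMap project; reviews three algorithms for haplotype inference and the different definitions and construction methods for haplotype blocks; and introduces tag SNP selection, methods for association analysis of haplotypes with complex diseases, the available public SNP databases, and applications of SNPs and haplotypes in complex diseases and drug response.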
