Found 20 similar articles (search time: 0 ms)
1.
SLAF-seq: An Efficient Method of Large-Scale De Novo SNP Discovery and Genotyping Using High-Throughput Sequencing (total citations: 2; self-citations: 0; citations by others: 2)
Xiaowen Sun Dongyuan Liu Xiaofeng Zhang Wenbin Li Hui Liu Weiguo Hong Chuanbei Jiang Ning Guan Chouxian Ma Huaping Zeng Chunhua Xu Jun Song Long Huang Chunmei Wang Junjie Shi Rui Wang Xianhu Zheng Cuiyun Lu Xiaowu Wang Hongkun Zheng 《PloS one》2013,8(3)
Large-scale genotyping plays an important role in genetic association studies. It has provided new opportunities for gene discovery, especially when combined with high-throughput sequencing technologies. Here, we report an efficient solution for large-scale genotyping. We call it specific-locus amplified fragment sequencing (SLAF-seq). SLAF-seq technology has several distinguishing characteristics: i) deep sequencing to ensure genotyping accuracy; ii) reduced representation strategy to reduce sequencing costs; iii) pre-designed reduced representation scheme to optimize marker efficiency; and iv) double barcode system for large populations. In this study, we tested the efficiency of SLAF-seq on rice and soybean data. Both sets of results showed strong consistency between predicted and practical SLAFs and considerable genotyping accuracy. We also report the highest density genetic map yet created for any organism without a reference genome sequence, common carp in this case, using SLAF-seq data. We detected 50,530 high-quality SLAFs with 13,291 SNPs genotyped in 211 individual carp. The genetic map contained 5,885 markers with 0.68 cM intervals on average. A comparative genomics study between common carp genetic map and zebrafish genome sequence map showed high-quality SLAF-seq genotyping results. SLAF-seq provides a high-resolution strategy for large-scale genotyping and can be generally applicable to various species and populations.
2.
Alex R. Hastie Lingli Dong Alexis Smith Jeff Finklestein Ernest T. Lam Naxin Huo Han Cao Pui-Yan Kwok Karin R. Deal Jan Dvorak Ming-Cheng Luo Yong Gu Ming Xiao 《PloS one》2013,8(2)
Next-generation sequencing (NGS) technologies have enabled high-throughput and low-cost generation of sequence data; however, de novo genome assembly remains a great challenge, particularly for large genomes. NGS short reads are often insufficient to create large contigs that span repeat sequences and to facilitate unambiguous assembly. Plant genomes are notorious for containing high quantities of repetitive elements, which, combined with huge genome sizes, have made accurate assembly of these large and complex genomes intractable thus far. Using two-color genome mapping of tiling bacterial artificial chromosome (BAC) clones on nanochannel arrays, we completed high-confidence assembly of a 2.1-Mb, highly repetitive region in the large and complex genome of Aegilops tauschii, the D-genome donor of hexaploid wheat (Triticum aestivum). Genome mapping is based on direct visualization of sequence motifs on single DNA molecules hundreds of kilobases in length. With the genome map as a scaffold, we anchored unplaced sequence contigs, validated the initial draft assembly, and resolved instances of misassembly, some involving contigs <2 kb long, to dramatically improve the assembly from 75% to 95% complete.
3.
4.
5.
Association Mapping in Outbred Populations: Power and Efficiency When Genotyping Parents and Phenotyping Progeny
We develop expressions for the power to detect associations between parental genotypes and offspring phenotypes for quantitative traits. Three different “indirect” experimental designs are considered: full-sib, half-sib, and full-sib–half-sib families. We compare the power of these designs to detect genotype–phenotype associations relative to the common, “direct,” approach of genotyping and phenotyping the same individuals. When heritability is low, the indirect designs can outperform the direct method. However, the extra power comes at a cost due to an increased phenotyping effort. By developing expressions for optimal experimental designs given the cost of phenotyping relative to genotyping, we show how the extra costs associated with phenotyping a large number of individuals will influence experimental design decisions. Our results suggest that indirect association studies can be a powerful means of detecting allelic associations in outbred populations of species for which genotyping and phenotyping the same individuals is impractical and for life history and behavioral traits that are heavily influenced by environmental variance and therefore best measured on groups of individuals. On purely economic grounds, however, indirect association studies are likely to be favored only when phenotyping is substantially less expensive than genotyping. A web-based application implementing our expressions has been developed to aid in the design of indirect association studies.
6.
Pharmacogenetic research benefits first-hand from the abundance of information provided by the completion of the Human Genome Project. With such a tremendous amount of data available comes an explosion of genotyping methods. Pyrosequencing® is one of the most thorough yet simple methods to date used to analyze polymorphisms. It can also identify tri-allelic polymorphisms, indels, and short-repeat polymorphisms, and can determine allele percentages for methylation or pooled sample assessment. In addition, there is a standardized control sequence that provides internal quality control. This method has led to rapid and efficient single-nucleotide polymorphism evaluation, including many clinically relevant polymorphisms. The technique and methodology of Pyrosequencing are explained.
7.
Samuel P. Strom Michael J. Clark Ariadna Martinez Sarah Garcia Amira A. Abelazeem Anna Matynia Sachin Parikh Lori S. Sullivan Sara J. Bowne Stephen P. Daiger Michael B. Gorin 《PloS one》2016,11(3)
Background
Retinitis pigmentosa is a phenotype with diverse genetic causes. Due to this genetic heterogeneity, genome-wide identification and analysis of protein-altering DNA variants by exome sequencing is a powerful tool for novel variant and disease gene discovery. In this study, exome sequencing analysis was used to search for potentially causal DNA variants in a two-generation pedigree with apparent dominant retinitis pigmentosa.
Methods
Variant identification and analysis of three affected members (mother and two affected offspring) was performed via exome sequencing. Parental samples of the index case were used to establish inheritance. Follow-up testing of 94 additional retinitis pigmentosa pedigrees was performed via retrospective analysis or Sanger sequencing.
Results and Conclusions
A total of 136 high quality coding variants in 123 genes were identified which are consistent with autosomal dominant disease. Of these, one of the strongest genetic and functional candidates is a c.269A>G (p.Tyr90Cys) variant in ARL3. Follow-up testing established that this variant occurred de novo in the index case. No additional putative causal variants in ARL3 were identified in the follow-up cohort, suggesting that if ARL3 variants can cause adRP it is an extremely rare phenomenon.
8.
Remarkable advances in DNA sequencing technology have created a need for de novo genome assembly methods tailored to work with the new sequencing data types. Many such methods have been published in recent years, but assembling raw sequence data to obtain a draft genome has remained a complex, multi-step process, involving several stages of sequence data cleaning, error correction, assembly, and quality control. Successful application of these steps usually requires intimate knowledge of a diverse set of algorithms and software. We present an assembly pipeline called A5 (Andrew And Aaron's Awesome Assembly pipeline) that simplifies the entire genome assembly process by automating these stages, by integrating several previously published algorithms with new algorithms for quality control and automated assembly parameter selection. We demonstrate that A5 can produce assemblies of quality comparable to a leading assembly algorithm, SOAPdenovo, without any prior knowledge of the particular genome being assembled and without the extensive parameter tuning required by the other assembly algorithm. In particular, the assemblies produced by A5 exhibit 50% or more reduction in broken protein coding sequences relative to SOAPdenovo assemblies. The A5 pipeline can also assemble Illumina sequence data from libraries constructed by the Nextera (transposon-catalyzed) protocol, which have markedly different characteristics to mechanically sheared libraries. Finally, A5 has modest compute requirements, and can assemble a typical bacterial genome on current desktop or laptop computer hardware in under two hours, depending on depth of coverage.
9.
10.
11.
12.
Udai P. Singh Raj K. Singh Yashuhiro Isogai Yoshitsugu Shiro 《International journal of peptide research and therapeutics》2006,12(4):379-385
A de novo 63-residue peptide (MHB) has been synthesized biochemically and used to bind manganese(II) ions. In the designed peptide, the leucine of the peptide dA1 (prototype) was replaced by His27 and Asp41 for binding the manganese(II) ions. Chromatography studies and mass determination showed that, in aqueous solution, the new peptide folds into a monomeric, highly helical structure with an active site similar to that of native Mn–SOD. An electron paramagnetic resonance (EPR) study suggested that the peptide loosely binds a single manganese(II) ion per molecule, with a K_D value of about 36 μM. Circular dichroism (CD) studies demonstrated that the helical content of the peptide did not change significantly even after binding the metal ions. The SOD activity study of the Mn–peptide complex showed that the IC50 value is 8.08 μM.
13.
Ivan Coluzza 《PloS one》2014,9(12)
Protein folding and design are major biophysical problems, the solution of which would lead to important applications especially in medicine. Here we provide evidence of how a novel parametrization of the Caterpillar model may be used for both quantitative protein design and folding. With computer simulations it is shown that, for a large set of real protein structures, the model produces designed sequences with similar physical properties to the corresponding naturally occurring sequences. The designed sequences require further experimental testing. For an independent set of proteins, previously used as benchmark, the correct folded structure of both the designed and the natural sequences is also demonstrated. The equilibrium folding properties are characterized by free energy calculations. The resulting free energy profiles not only are consistent among natural and designed proteins, but also show a remarkable precision when the folded structures are compared to the experimentally determined ones. Ultimately, the updated Caterpillar model is unique in the combination of its three fundamental features: its simplicity, its ability to produce natural foldable designed sequences, and its structure prediction precision. It is also remarkable that low frustration sequences can be obtained with such a simple and universal design procedure, and that the folding of natural proteins shows funnelled free energy landscapes without the need of any potentials based on the native structure.
14.
Xin He Stephan J. Sanders Li Liu Silvia De Rubeis Elaine T. Lim James S. Sutcliffe Gerard D. Schellenberg Richard A. Gibbs Mark J. Daly Joseph D. Buxbaum Matthew W. State Bernie Devlin Kathryn Roeder 《PLoS genetics》2013,9(8)
De novo mutations affect risk for many diseases and disorders, especially those with early-onset. An example is autism spectrum disorders (ASD). Four recent whole-exome sequencing (WES) studies of ASD families revealed a handful of novel risk genes, based on independent de novo loss-of-function (LoF) mutations falling in the same gene, and found that de novo LoF mutations occurred at a twofold higher rate than expected by chance. However successful these studies were, they used only a small fraction of the data, excluding other types of de novo mutations and inherited rare variants. Moreover, such analyses cannot readily incorporate data from case-control studies. An important research challenge in gene discovery, therefore, is to develop statistical methods that accommodate a broader class of rare variation. We develop methods that can incorporate WES data regarding de novo mutations, inherited variants present, and variants identified within cases and controls. TADA, for Transmission And De novo Association, integrates these data by a gene-based likelihood model involving parameters for allele frequencies and gene-specific penetrances. Inference is based on a Hierarchical Bayes strategy that borrows information across all genes to infer parameters that would be difficult to estimate for individual genes. In addition to theoretical development we validated TADA using realistic simulations mimicking rare, large-effect mutations affecting risk for ASD and show it has dramatically better power than other common methods of analysis. Thus TADA's integration of various kinds of WES data can be a highly effective means of identifying novel risk genes. Indeed, application of TADA to WES data from subjects with ASD and their families, as well as from a study of ASD subjects and controls, revealed several novel and promising ASD candidate genes with strong statistical support.
15.
Jiehan Li Edward Daly Enrico Campioli Martin Wabitsch Vassilios Papadopoulos 《The Journal of biological chemistry》2014,289(2):747-764
Local production and action of cholesterol metabolites such as steroids or oxysterols within endocrine tissues are currently recognized as an important principle in the cell type- and tissue-specific regulation of hormone effects. In adipocytes, one of the most abundant endocrine cells in the human body, the de novo production of steroids or oxysterols from cholesterol has not been examined. Here, we demonstrate that essential components of cholesterol transport and metabolism machinery in the initial steps of steroid and/or oxysterol biosynthesis pathways are present and active in adipocytes. The ability of adipocyte CYP11A1 to produce pregnenolone is demonstrated for the first time, rendering the adipocyte a steroidogenic cell. The oxysterol 27-hydroxycholesterol (27HC), synthesized by the mitochondrial enzyme CYP27A1, was identified as one of the major de novo adipocyte products from cholesterol and its precursor mevalonate. Inhibition of CYP27A1 activity or knockdown and deletion of the Cyp27a1 gene induced adipocyte differentiation, suggesting a paracrine or autocrine biological significance for the adipocyte-derived 27HC. These findings suggest that the presence of the 27HC biosynthesis pathway in adipocytes may represent a defense mechanism to prevent the formation of new fat cells upon overfeeding with dietary cholesterol.
16.
17.
ShapeR is an open source software package that runs on the R platform and is specifically designed to study otolith shape variation among fish populations. The package extends previously described software used for otolith shape analysis by allowing the user to automatically extract closed contour outlines from a large number of images, perform smoothing to eliminate pixel noise, choose whether to apply a Fourier or a wavelet transform to the outlines, and visualize the mean shape. The outputs of the package are independent Fourier or wavelet coefficients, which can be directly imported into a wide range of statistical packages in R. The package might prove useful in studies of any two-dimensional objects.
18.
Marc A. Coram Sophie I. Candille Qing Duan Kei Hang K. Chan Yun Li Charles Kooperberg Alex P. Reiner Hua Tang 《American journal of human genetics》2015,96(5):740-752
Elucidating the genetic basis of complex traits and diseases in non-European populations is particularly challenging because US minority populations have been under-represented in genetic association studies. We developed an empirical Bayes approach named XPEB (cross-population empirical Bayes), designed to improve the power for mapping complex-trait-associated loci in a minority population by exploiting information from genome-wide association studies (GWASs) from another ethnic population. Taking as input summary statistics from two GWASs, a target GWAS from an ethnic minority population of primary interest and an auxiliary base GWAS (such as a larger GWAS in Europeans), our XPEB approach reprioritizes SNPs in the target population to compute local false-discovery rates. We demonstrated, through simulations, that whenever the base GWAS harbors relevant information, XPEB gains efficiency. Moreover, XPEB has the ability to discard irrelevant auxiliary information, providing a safeguard against inflated false-discovery rates due to genetic heterogeneity between populations. Applied to a blood-lipids study in African Americans, XPEB more than quadrupled the discoveries from the conventional approach, which used a target GWAS alone, bringing the number of significant loci from 14 to 65. Thus, XPEB offers a flexible framework for mapping complex traits in minority populations.
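For readers unfamiliar with the local false-discovery rate mentioned in the abstract above, the following is a minimal sketch of the generic two-group model, in which a z-score comes from a standard normal null with probability pi0 and from a wider zero-mean normal otherwise. This is not the XPEB method itself: the function names, the normal alternative, and the parameters `pi0` and `alt_sd` are assumptions made purely for illustration.

```python
import math


def normal_pdf(z: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """Density of a normal distribution with the given mean and sd."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def local_fdr(z: float, pi0: float, alt_sd: float) -> float:
    """Local false-discovery rate under a hypothetical two-group model.

    pi0    -- assumed prior probability that a SNP is null
    alt_sd -- assumed sd of z-scores for non-null SNPs (must exceed 0)
    """
    f0 = normal_pdf(z)                  # null density, N(0, 1)
    f1 = normal_pdf(z, 0.0, alt_sd)     # illustrative alternative density
    mixture = pi0 * f0 + (1.0 - pi0) * f1
    return pi0 * f0 / mixture
```

In this toy model a large |z| gives a small local false-discovery rate, and lowering `pi0` (for instance because auxiliary data suggest a SNP is more likely to be associated) lowers it further, which is the general sense in which a prior can reprioritize SNPs.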
19.
Timothy M. Beissinger Candice N. Hirsch Rajandeep S. Sekhon Jillian M. Foerster James M. Johnson German Muttoni Brieanne Vaillancourt C. Robin Buell Shawn M. Kaeppler Natalia de Leon 《Genetics》2013,193(4):1073-1081
Genotyping-by-sequencing (GBS) approaches provide low-cost, high-density genotype information. However, GBS has unique technical considerations, including a substantial amount of missing data and a nonuniform distribution of sequence reads. The goal of this study was to characterize technical variation using this method and to develop methods to optimize read depth to obtain desired marker coverage. To empirically assess the distribution of fragments produced using GBS, ∼8.69 Gb of GBS data were generated on the Zea mays reference inbred B73, utilizing ApeKI for genome reduction and single-end reads between 75 and 81 bp in length. We observed wide variation in sequence coverage across sites. Approximately 76% of potentially observable cut site-adjacent sequence fragments had no sequencing reads whereas a portion had substantially greater read depth than expected, up to 2369 times the expected mean. The methods described in this article facilitate determination of sequencing depth in the context of empirically defined read depth to achieve desired marker density for genetic mapping studies.
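As a rough illustration of why read-depth planning matters for the abstract above, the sketch below uses an idealized uniform Poisson sampling model. That model is an assumption introduced here, not the method of the article, and the function names are hypothetical. Under it, the fraction of sites with no reads is exp(-mean depth); because the abstract reports strongly nonuniform coverage, real GBS data would be expected to need more depth than this baseline suggests.

```python
import math


def expected_zero_fraction(mean_depth: float) -> float:
    """Fraction of sites with zero reads if reads fell uniformly (Poisson)."""
    if mean_depth < 0:
        raise ValueError("mean_depth must be non-negative")
    return math.exp(-mean_depth)


def depth_for_coverage(target_fraction: float) -> float:
    """Mean depth at which target_fraction of sites get at least one read,
    under the same idealized Poisson assumption."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be strictly between 0 and 1")
    return -math.log(1.0 - target_fraction)


# Under this baseline, covering 95% of sites needs a mean depth of about 3.0.
baseline_depth = depth_for_coverage(0.95)
```

Comparing an empirically observed zero-read fraction with `expected_zero_fraction` at the same mean depth gives a quick sense of how far a dataset departs from uniform sampling.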