Found 20 similar articles; search time 0 ms
1.
Zhenglin Du Liang Ma Hongzhu Qu Wei Chen Bing Zhang Xi Lu Weibo Zhai Xin Sheng Yongqiao Sun Wenjie Li Meng Lei Qiuhui Qi Na Yuan Shuo Shi Jingyao Zeng Jinyue Wang Yadong Yang Qi Liu Yaqiang Hong Lili Dong Zhewen Zhang Dong Zou Yanqing Wang Shuhui Song Fan Liu Xiangdong Fang Hua Chen Xin Liu Jingfa Xiao Changqing Zeng 《Genomics, Proteomics & Bioinformatics》2019,17(3):229-247
To unravel the genetic mechanisms of disease and physiological traits, comprehensive sequencing analysis of large samples from Chinese populations is required. Here, we report the primary results of the Chinese Academy of Sciences Precision Medicine Initiative (CASPMI) project launched by the Chinese Academy of Sciences, including the de novo assembly of a northern Han reference genome (NH1.0) and whole-genome analyses of 597 healthy people from most areas of China. Given that the two existing reference genomes for Han Chinese (YH and HX1) were both from the south, we constructed NH1.0, a new reference genome from a northern individual, by combining the sequencing strategies of PacBio, 10× Genomics, and Bionano mapping. Using this integrated approach, we obtained an N50 scaffold size of 46.63 Mb for the NH1.0 genome and performed a comparative genome analysis of NH1.0 with YH and HX1. To generate a genomic variation map of Chinese populations, we performed whole-genome sequencing of 597 participants and identified 24.85 million (M) single nucleotide variants (SNVs), 3.85 M small indels, and 106,382 structural variations. In the association analysis with collected phenotypes, we found that the T allele of rs1549293 in KAT8 significantly correlated with waist circumference in northern Han males. Moreover, significant genetic diversity in MTHFR, TCN2, FADS1, and FADS2, which are associated with circulating folate, vitamin B12, or lipid metabolism, was observed between northerners and southerners. In particular, for the homocysteine-increasing allele of rs1801133 (MTHFR 677T), we hypothesize that there exists a "comfort" zone for a high frequency of 677T between latitudes of 35–45 degrees North. Taken together, our results provide a high-quality northern Han reference genome and novel population-specific data sets of genetic variants for use in personalized and precision medicine.
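The N50 scaffold size quoted in this abstract is a standard contiguity statistic: the length L such that scaffolds of length at least L together cover at least half of the assembly. A minimal Python sketch of that definition follows; the function name `n50` and the toy lengths are illustrative assumptions, not the CASPMI pipeline or its data.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    # Walk from the longest scaffold down, accumulating covered bases.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes coverage to half the assembly is N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly (lengths in Mb), not data from the paper.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # prints 70
```

In the toy case the total is 300 Mb, and the two longest scaffolds (80 + 70) already reach the 150 Mb halfway mark, so the N50 is 70 Mb.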
2.
Benjamin Georgi David Craig Rachel L. Kember Wencheng Liu Ingrid Lindquist Sara Nasser Christopher Brown Janice A. Egeland Steven M. Paul Maja Bućan 《PLoS genetics》2014,10(3)
Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders.
3.
Chenxi Jia Limei Hui Weifeng Cao Christopher B. Lietz Xiaoyue Jiang Ruibing Chen Adam D. Catherman Paul M. Thomas Ying Ge Neil L. Kelleher Lingjun Li 《Molecular & cellular proteomics : MCP》2012,11(12):1951-1964
A complete understanding of the biological functions of large signaling peptides (>4 kDa) requires comprehensive characterization of their amino acid sequences and post-translational modifications, which presents significant analytical challenges. In the past decade, there has been great success with mass spectrometry-based de novo sequencing of small neuropeptides. However, these approaches are less applicable to larger neuropeptides because of the inefficient fragmentation of peptides larger than 4 kDa and their lower endogenous abundance. The conventional proteomics approach focuses on large-scale determination of protein identities via database searching, lacking the ability for in-depth elucidation of individual amino acid residues. Here, we present a multifaceted MS approach for identification and characterization of large crustacean hyperglycemic hormone (CHH)-family neuropeptides, a class of peptide hormones that play central roles in the regulation of many important physiological processes of crustaceans. Six crustacean CHH-family neuropeptides (8–9.5 kDa), including two novel peptides with extensive disulfide linkages and PTMs, were fully sequenced without reference to genomic databases. High-definition de novo sequencing was achieved by a combination of bottom-up, off-line top-down, and on-line top-down tandem MS methods. Statistical evaluation indicated that these methods provided complementary information for sequence interpretation and increased the local identification confidence of each amino acid. Further investigations by MALDI imaging MS mapped the spatial distribution and colocalization patterns of various CHH-family neuropeptides in the neuroendocrine organs, revealing that two CHH-subfamilies are involved in distinct signaling pathways.
Neuropeptides and hormones comprise a diverse class of signaling molecules involved in numerous essential physiological processes, including analgesia, reward, food intake, learning and memory (1). 
Disorders of the neurosecretory and neuroendocrine systems influence many pathological processes. For example, obesity results from failure of energy homeostasis in association with endocrine alterations (2, 3). Previous work from our lab using crustaceans as model organisms found that multiple neuropeptides were implicated in control of food intake, including RFamides, tachykinin related peptides, RYamides, and pyrokinins (4–6). Crustacean hyperglycemic hormone (CHH) family neuropeptides play a central role in energy homeostasis of crustaceans (7–17). Hyperglycemic response of the CHHs was first reported after injection of crude eyestalk extract in crustaceans. Based on their preprohormone organization, the CHH family can be grouped into two sub-families: subfamily-I containing CHH, and subfamily-II containing molt-inhibiting hormone (MIH) and mandibular organ-inhibiting hormone (MOIH). The preprohormones of subfamily-I have a CHH precursor related peptide (CPRP) that is cleaved off during processing, whereas preprohormones of subfamily-II lack the CPRP (9). Uncovering their physiological functions will provide new insights into neuroendocrine regulation of energy homeostasis. Characterization of CHH-family neuropeptides is challenging. They are comprised of more than 70 amino acids and often contain multiple post-translational modifications (PTMs) and complex disulfide bridge connections (7). In addition, physiological concentrations of these peptide hormones are typically below picomolar level, and most crustacean species do not have available genome and proteome databases to assist MS-based sequencing. MS-based neuropeptidomics provides a powerful tool for rapid discovery and analysis of a large number of endogenous peptides from the brain and the central nervous system. Our group and others have greatly expanded the peptidomes of many model organisms (3, 18–33). 
For example, we have discovered more than 200 neuropeptides with several neuropeptide families consisting of as many as 20–40 members in a simple crustacean model system (5, 6, 25–31, 34). However, a majority of these neuropeptides are small peptides 5–15 amino acid residues long, leaving a gap in identifying larger signaling peptides from organisms without a sequenced genome. The observed lack of larger peptide hormones can be attributed to the lack of effective de novo sequencing strategies for neuropeptides larger than 4 kDa, which are inherently more difficult to fragment using conventional techniques (34–37). Although classical proteomics studies examine larger proteins, these tools are limited to identification based on database searching with one or more peptides matching without complete amino acid sequence coverage (36, 38). Large populations of neuropeptides from 4–10 kDa exist in the nervous systems of both vertebrates and invertebrates (9, 39, 40). Understanding their functional roles requires sufficient molecular knowledge and a unique analytical approach. Therefore, developing effective and reliable methods for de novo sequencing of large neuropeptides at the individual amino acid residue level is an urgent gap to fill in neurobiology. In this study, we present a multifaceted MS strategy aimed at high-definition de novo sequencing and comprehensive characterization of the CHH-family neuropeptides in the crustacean central nervous system. The high-definition de novo sequencing was achieved by a combination of three methods: (1) enzymatic digestion and LC-tandem mass spectrometry (MS/MS) bottom-up analysis to generate detailed sequences of proteolytic peptides; (2) off-line LC fractionation and subsequent top-down MS/MS to obtain high-quality fragmentation maps of intact peptides; and (3) on-line LC coupled to top-down MS/MS to allow rapid sequence analysis of low abundance peptides. 
Combining the three methods overcomes the limitations of each, and thus offers complementary and high-confidence determination of amino acid residues. We report the complete sequence analysis of six CHH-family neuropeptides including the discovery of two novel peptides. With the accurate molecular information, MALDI imaging and ion mobility MS were conducted for the first time to explore their anatomical distribution and biochemical properties.
4.
5.
6.
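In recent years, rapidly developing high-throughput sequencing technology (next generation sequencing, NGS) has fully demonstrated its advantages of low cost, high throughput, and broad applicability across all areas of life science research. In modern agricultural biotechnology, high-throughput sequencing allows scientists not only to carry out in-depth whole-genome sequencing and resequencing of crops, model plants, or different cultivars more economically and efficiently, but also to perform efficient and accurate analyses of genetic differences, molecular markers, linkage maps, epigenetics, and transcriptomes across hundreds or thousands of cultivars, thereby improving crop breeding techniques and accelerating the breeding of new varieties. Obtaining the whole-genome sequence of a crop is the foundation for these other studies and analyses. By reviewing recently published work that used high-throughput sequencing to determine and assemble whole crop genomes, this article illustrates the broad prospects of high-throughput sequencing in modern agricultural biotechnology and the research foundation it has established.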
7.
8.
Irina N. Marinova Jacob Engelbrecht Adrian Ewald Lasse L. Langholm Christian Holmberg Birthe B. Kragelund Colin Gordon Olaf Nielsen Rasmus Hartmann-Petersen 《PloS one》2015,10(2)
The protein called p97 in mammals and Cdc48 in budding and fission yeast is a homo-hexameric, ring-shaped, ubiquitin-dependent ATPase complex involved in a range of cellular functions, including protein degradation, vesicle fusion, DNA repair, and cell division. The cdc48+ gene is essential for viability in fission yeast, and point mutations in the human orthologue have been linked to disease. To analyze the function of p97/Cdc48 further, we performed a screen for cold-sensitive suppressors of the temperature-sensitive cdc48-353 fission yeast strain. In total, 29 independent pseudo revertants that had lost the temperature-sensitive growth defect of the cdc48-353 strain were isolated. Of these, 28 had instead acquired a cold-sensitive phenotype. Since the suppressors were all spontaneous mutants, and not the result of mutagenesis induced by chemicals or UV irradiation, we reasoned that the genome sequences of the 29 independent cdc48-353 suppressors were most likely identical with the exception of the acquired suppressor mutations. This prompted us to test if a whole genome sequencing approach would allow us to map the mutations. Indeed, genome sequencing unambiguously revealed that the cold-sensitive suppressors were all second site intragenic cdc48 mutants. Projecting these onto the Cdc48 structure revealed that while the original temperature-sensitive G338D mutation is positioned near the central pore in the hexameric ring, the suppressor mutations locate to subunit-subunit and inter-domain boundaries. This suggests that Cdc48-353 is structurally compromised at the restrictive temperature, but re-established in the suppressor mutants. The last suppressor was an extragenic frame shift mutation in the ufd1 gene, which encodes a known Cdc48 co-factor. In conclusion, we show, using a novel whole genome sequencing approach, that Cdc48-353 is structurally compromised at the restrictive temperature, but stabilized in the suppressors.
9.
10.
Min Ni Marianna Feretzaki Wenjun Li Anna Floyd-Averette Piotr Mieczkowski Fred S. Dietrich Joseph Heitman 《PLoS biology》2013,11(9)
Aneuploidy is known to be deleterious and underlies several common human diseases, including cancer and genetic disorders such as trisomy 21 in Down's syndrome. In contrast, aneuploidy can also be advantageous and in fungi confers antifungal drug resistance and enables rapid adaptive evolution. We report here that sexual reproduction generates phenotypic and genotypic diversity in the human pathogenic yeast Cryptococcus neoformans, which is globally distributed and commonly infects individuals with compromised immunity, such as HIV/AIDS patients, causing life-threatening meningoencephalitis. C. neoformans has a defined a-α opposite sexual cycle; however, >99% of isolates are of the α mating type. Interestingly, α cells can undergo α-α unisexual reproduction, even involving genotypically identical cells. A central question is: Why would cells mate with themselves given that sex is costly and typically serves to admix preexisting genetic diversity from genetically divergent parents? In this study, we demonstrate that α-α unisexual reproduction frequently generates phenotypic diversity, and the majority of these variant progeny are aneuploid. Aneuploidy is responsible for the observed phenotypic changes, as chromosome loss restoring euploidy results in a wild-type phenotype. Other genetic changes, including diploidization, chromosome length polymorphisms, SNPs, and indels, were also generated. Phenotypic/genotypic changes were not observed following asexual mitotic reproduction. Aneuploidy was also detected in progeny from a-α opposite-sex congenic mating; thus, both homothallic and heterothallic sexual reproduction can generate phenotypic diversity de novo. Our study suggests that the ability to undergo unisexual reproduction may be an evolutionary strategy for eukaryotic microbial pathogens, enabling de novo genotypic and phenotypic plasticity and facilitating rapid adaptation to novel environments.
11.
12.
13.
14.
Guojie Cao Jianghong Meng Errol Strain Robert Stones James Pettengill Shaohua Zhao Patrick McDermott Eric Brown Marc Allard 《PloS one》2013,8(2)
Salmonella Newport has ranked in the top three Salmonella serotypes associated with foodborne outbreaks from 1995 to 2011 in the United States. In the current study, we selected 26 S. Newport strains isolated from diverse sources and geographic locations and then conducted 454 shotgun pyrosequencing procedures to obtain 16–24× coverage of high quality draft genomes for each strain. Comparative genomic analysis of 28 S. Newport strains (including 2 reference genomes) and 15 outgroup genomes identified more than 140,000 informative SNPs. A resulting phylogenetic tree consisted of four sublineages and indicated that S. Newport had a clear geographic structure. Strains from Asia were divergent from those from the Americas. Our findings demonstrated that analysis using whole genome sequencing data resulted in a more accurate picture of phylogeny compared to that using single genes or small sets of genes. We selected loci around the mutS gene of S. Newport to differentiate distinct lineages, including those between invH and mutS genes at the 3′ end of Salmonella Pathogenicity Island 1 (SPI-1), ste fimbrial operon, and Clustered, Regularly Interspaced, Short Palindromic Repeats (CRISPR) associated-proteins (cas). These genes in the outgroup genomes held high similarity with either S. Newport Lineage II or III at the same loci. S. Newport Lineages II and III have different evolutionary histories in this region and our data demonstrated genetic flow and homologous recombination events around mutS. The findings suggested that S. Newport Lineages II and III diverged early in the serotype evolution and have evolved largely independently. Moreover, we identified genes that could delineate sublineages within the phylogenetic tree and that could be used as potential biomarkers for trace-back investigations during outbreaks. Thus, whole genome sequencing data enabled us to better understand the genetic background of pathogenicity and evolutionary history of S. Newport and also provided additional markers for epidemiological response.
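The abstract above counts "informative SNPs" without defining the term. One common reading in phylogenetics is parsimony-informative sites: alignment columns in which at least two different bases each occur in at least two genomes. The Python sketch below illustrates that criterion only; treating "informative" as parsimony-informative is an assumption, and the function names and toy alignment are hypothetical, not the authors' pipeline or data.

```python
from collections import Counter


def is_parsimony_informative(column):
    """True if at least two bases each occur in at least two sequences."""
    # Ignore gaps and ambiguity codes; count only unambiguous bases.
    counts = Counter(base for base in column.upper() if base in "ACGT")
    return sum(1 for n in counts.values() if n >= 2) >= 2


def informative_sites(alignment):
    """Return 0-based column indices of parsimony-informative sites."""
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("aligned sequences must have equal length")
    return [
        i
        for i, column in enumerate(zip(*alignment))
        if is_parsimony_informative("".join(column))
    ]


# Hypothetical four-genome toy alignment (assumption, not study data).
toy_alignment = ["ACGTA", "ACGTC", "ATGAC", "ATGAA"]
print(informative_sites(toy_alignment))  # prints [1, 3, 4]
```

A column where a variant appears in only one genome (a singleton) is skipped, because it cannot favor one tree topology over another under parsimony.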
15.
16.
17.
18.
Alexander C. Outhred Nadine Holmes Rosemarie Sadsad Elena Martinez Peter Jelfs Grant A. Hill-Cawthorne Gwendolyn L. Gilbert Ben J. Marais Vitali Sintchenko 《PloS one》2016,11(3)
Background
Improved tuberculosis control and the need to contain the spread of drug-resistant strains provide a strong rationale for exploring tuberculosis transmission dynamics at the population level. Whole-genome sequencing provides optimal strain resolution, facilitating detailed mapping of potential transmission pathways.
Methods
We sequenced 22 isolates from a Mycobacterium tuberculosis cluster in New South Wales, Australia, identified during routine 24-locus mycobacterial interspersed repetitive unit typing. Following high-depth paired-end sequencing using the Illumina HiSeq 2000 platform, two independent pipelines were employed for analysis, both employing read mapping onto reference genomes as well as de novo assembly, to control biases in variant detection. In addition to single-nucleotide polymorphisms, the analyses also sought to identify insertions, deletions and structural variants.
Results
Isolates were highly similar, with a distance of 13 variants between the most distant members of the cluster. The most sensitive analysis classified the 22 isolates into 18 groups. Four of the isolates did not appear to share a recent common ancestor with the largest clade; another four isolates had an uncertain ancestral relationship with the largest clade.
Conclusion
Whole genome sequencing, with analysis of single-nucleotide polymorphisms, insertions, deletions, structural variants and subpopulations, enabled the highest possible level of discrimination between cluster members, clarifying likely transmission pathways and exposing the complexity of strain origin. The analysis provides a basis for targeted public health intervention and enhanced classification of future isolates linked to the cluster.
19.