
1.

The Genome Sequence of Lone Star Virus, a Highly Divergent Bunyavirus Found in the Amblyomma americanum Tick

Andrea Swei Brandy J. Russell Samia N. Naccache Beniwende Kabre Narayanan Veeraraghavan Mark A. Pilgard Barbara J. B. Johnson Charles Y. Chiu 《PloS one》2013,8(4)

Viruses in the family Bunyaviridae infect a wide range of plant, insect, and animal hosts. Tick-borne bunyaviruses in the Phlebovirus genus, including Severe Fever with Thrombocytopenia Syndrome virus (SFTSV) in China, Heartland virus (HRTV) in the United States, and Bhanja virus in Eurasia and Africa, have been associated with acute febrile illness in humans. Here we sought to characterize the growth characteristics and genome of Lone Star virus (LSV), an unclassified bunyavirus originally isolated from the lone star tick Amblyomma americanum. LSV was able to infect both human (HeLa) and monkey (Vero) cells. Cytopathic effects were seen within 72 h in both cell lines; vacuolization was observed in infected Vero, but not HeLa, cells. Viral culture supernatants were examined by unbiased deep sequencing and analysis using an in-house developed rapid computational pipeline for viral discovery, which definitively identified LSV as a phlebovirus. De novo assembly of the full genome revealed that LSV is highly divergent, sharing <61% overall amino acid identity with any other bunyavirus. Despite this sequence diversity, LSV was found by phylogenetic analysis to be part of a well-supported clade that includes members of the Bhanja group viruses, which are most closely related to SFTSV/HRTV. The genome sequencing of LSV is a critical first step in developing diagnostic tools to determine the risk of arbovirus transmission by A. americanum, a tick of growing importance given its expanding geographic range and competence as a disease vector. This study also underscores the power of deep sequencing analysis in rapidly identifying and sequencing the genomes of viruses of potential clinical and public health significance.

2.

VIpower: Simulation-based tool for estimating power of viral integration detection via high-throughput sequencing

《Genomics》2020,112(1):207-211

Viral sequence integrations in the human genome have been implicated in various human diseases. Viral integrations remain among the most challenging-to-detect structural changes of the human genome. No studies have systematically analyzed how molecular and bioinformatics factors affect the power (sensitivity) to detect viral integrations using high-throughput sequencing (HTS). We selected a wide range of molecular and bioinformatics factors covering genome sequence characteristics, HTS features, and viral integration detection. We designed a fast simulation-based framework to model the process of detecting variable viral integration events in the human genome. We then examined the associations of selected factors with viral integration detection power. We identified six factors that significantly affected viral integration detection power (P < 2 × 10⁻¹⁶). The strongest factors associated with detection power included proportion of sample cells with clonal viral integrations (Pearson's ρ = 0.64), sequencing depth (ρ = 0.37), length of viral integration (ρ = 0.37), paired-end read insert size (ρ = 0.23), user-defined threshold (number of supporting reads) to claim successful identification of integrations (ρ = −0.19), and read length (when sequence volume was fixed) (ρ = −0.09). As the first tool of its kind, VIpower incorporates all these factors, which can be manipulated in concert with each other to optimize the detection power. This tool may be used to estimate viral integration detection power for various combinations of sequencing or analytic parameters. It may also be used to estimate the parameters required to achieve a specific power when designing new sequencing experiments.

3.

Tips and tricks for the assembly of a Corynebacterium pseudotuberculosis genome using a semiconductor sequencer

Rommel Thiago Jucá Ramos Adriana Ribeiro Carneiro Siomar de Castro Soares Anderson Rodrigues dos Santos Sintia Almeida Luis Guimarães Flávia Figueira Eudes Barbosa Andreas Tauch Vasco Azevedo Artur Silva 《Microbial biotechnology》2013,6(2):150-156

New sequencing platforms have enabled rapid decoding of complete prokaryotic genomes at relatively low cost. The Ion Torrent platform is an example of these technologies, characterized by lower coverage, generating challenges for the genome assembly. One particular problem is the lack of genomes that enable reference-based assembly, such as the one used in the present study, Corynebacterium pseudotuberculosis biovar equi, which causes high economic losses in the US equine industry. The quality treatment strategy incorporated into the assembly pipeline enabled a 16-fold greater use of the sequencing data obtained compared with traditional quality filter approaches. Data preprocessing prior to the de novo assembly enabled the use of known methodologies in the next-generation sequencing data assembly. Moreover, manual curation proved to be essential for ensuring a quality assembly, which was validated by comparative genomics with other species of the genus Corynebacterium. The present study presents a modus operandi that enables a greater and better use of data obtained from semiconductor sequencing for obtaining the complete genome from a prokaryotic microorganism, C. pseudotuberculosis, which is not a traditional biological model such as Escherichia coli.

4.

De novo Assembly of a 40 Mb Eukaryotic Genome from Short Sequence Reads: Sordaria macrospora, a Model Organism for Fungal Morphogenesis

Minou Nowrousian Jason E. Stajich Meiling Chu Ines Engh Eric Espagne Karen Halliday Jens Kamerewerd Frank Kempken Birgit Knab Hsiao-Che Kuo Heinz D. Osiewacz Stefanie Pöggeler Nick D. Read Stephan Seiler Kristina M. Smith Denise Zickler Ulrich Kück Michael Freitag 《PLoS genetics》2010,6(4)


5.

Breakpoint mapping of a novel de novo translocation t(X;20)(q11.1;p13) by positional cloning and long read sequencing

Usha R. Dutta Sudha N. Rao Vijaya Kumar Pidugu Vineeth V.S. Amrita Bhattacherjee Aneek Das Bhowmik Sathish K. Ramaswamy Kumar Gautam Singh Ashwin Dalal 《Genomics》2019,111(5):1108-1114

Disease-associated chromosomal rearrangements often have breakpoints located within disease-causing genes or in their vicinity. The purpose of this study is to characterize a balanced reciprocal translocation in a girl with intellectual disability and seizures by positional cloning and whole genome sequencing. The translocation was identified by G-banding and confirmed by WCP FISH. Fine mapping was performed using BAC clones, and whole genome sequencing was performed using Oxford Nanopore long-read sequencing technology at 1.46× coverage of the genome. The positional cloning showed split signals with BAC RP11-943J20. Long-read sequencing analysis of chimeric reads carrying parts of chromosomes X and 20 helped to identify the breakpoints to be in intron 2 of the ARHGEF9 gene on Xq11.1 and on 20p13 between the RASSF2 and SLC23A2 genes. This is the first report of a translocation that was successfully delineated to single-base resolution using Nanopore sequencing. The genotype-phenotype correlation is discussed.

6.

Syncephalastrum contaminatum, a new species in the Mucorales from Australia

《Mycoscience》2020,61(3):111-115

A new species is described in the Mucorales family Syncephalastraceae: Syncephalastrum contaminatum, isolated as an in vitro culture from a laboratory contaminant. The species has variable copies of the internal transcribed spacer (ITS) regions, requiring cloning of these regions prior to Sanger sequencing before subsequent use in phylogenetic comparisons with other fungi. The genome of the strain was sequenced using short paired-end reads to yield a draft genome of 28.6 Mb. Syncephalastrum contaminatum is distinguished by diverse DNA sequences at several loci from the other species of Syncephalastrum, including only 81% sequence identity between its ITS regions and those of S. racemosum. Its merosporangium produces four or more asexual spores and the genome sequencing information suggests that the species is heterothallic. The identification of this species highlights the limited knowledge about the early lineages of fungi both in Australia and globally.

7.

Protein interaction mapping on a functional shotgun sequence of Rickettsia sibirica (Total citations: 4; self-citations: 1; citations by others: 3)


Malek JA Wierzbowski JM Tao W Bosak SA Saranga DJ Doucette-Stamm L Smith DR McEwan PJ McKernan KJ 《Nucleic acids research》2004,32(3):1059-1064

Protein interaction maps can reveal novel pathways and functional complexes, allowing ‘guilt by association’ annotation of uncharacterized proteins. To address the need for large-scale protein interaction analyses, a bacterial two-hybrid system was coupled with a whole genome shotgun sequencing approach for microbial genome analysis. We report the first large-scale proteomics study using this system, integrating de novo genome sequencing with functional interaction mapping and annotation in a high-throughput format. We apply the approach by shotgun sequencing and annotating the genome of Rickettsia sibirica strain 246, an obligate intracellular human pathogen among the Spotted Fever Group rickettsiae. The bacteria invade endothelial cells and cause lysis after large amounts of progeny have accumulated. Little is known about specific Rickettsial virulence factors and their mode of pathogenicity. Analysis of the combined genomic sequence and protein–protein interaction data for a set of virulence related Type IV secretion system (T4SS) proteins revealed over 250 interactions and will provide insight into the mechanism of Rickettsial pathogenicity.

8.

Fast assembly of the mitochondrial genome of a plant parasitic nematode (Meloidogyne graminicola) using next generation sequencing

Guillaume Besnard Frank Jühling Élodie Chapuis Loubab Zedane Émeline Lhuillier Thierry Mateille Stéphane Bellafiore 《Comptes rendus biologies》2014,337(5):295-301

Little is known about the variations of nematode mitogenomes (mtDNA). Sequencing a complete mtDNA using a PCR approach remains a challenge due to frequent genome reorganizations and low sequence similarities between divergent nematode lineages. Here, a genome skimming approach based on HiSeq sequencing (shotgun) was used to assemble de novo the first complete mtDNA sequence of a root-knot nematode (Meloidogyne graminicola). An AT-rich genome (84.3%) of 20,030 bp was obtained with a mean sequencing depth greater than 300×. Thirty-six genes were identified with a semi-automated approach. A comparison with a gene map of the M. javanica mitochondrial genome indicates that the gene order is conserved within this nematode lineage. However, deep genome rearrangements were observed when comparing with other species of the superfamily Hoplolaimoidea. Repeat elements of 111 bp and 94 bp were found in a long non-coding region of 7.5 kb, as similarly reported in M. javanica and M. hapla. This study points out the power of next generation sequencing to produce complete mitochondrial genomes, even without a reference sequence, possibly opening new avenues for species/race identification, phylogenetics and population genetics of nematodes.

9.

A Novel Genome-Wide Association Study Approach Using Genotyping by Exome Sequencing Leads to the Identification of a Primary Open Angle Glaucoma Associated Inversion Disrupting ADAMTS17

Oliver P. Forman Louise Pettitt András M. Komáromy Peter Bedford Cathryn Mellersh 《PloS one》2015,10(12)


10.

Genome Sequencing of Idiopathic Pulmonary Fibrosis in Conjunction with a Medical School Human Anatomy Course

Akash Kumar Max Dougherty Gregory M. Findlay Madeleine Geisheker Jason Klein John Lazar Heather Machkovech Jesse Resnick Rebecca Resnick Alexander I. Salter Faezeh Talebi-Liasi Christopher Arakawa Jacob Baudin Andrew Bogaard Rebecca Salesky Qian Zhou Kelly Smith John I. Clark Jay Shendure Marshall S. Horwitz 《PloS one》2014,9(9)

Even in cases where there is no obvious family history of disease, genome sequencing may contribute to clinical diagnosis and management. Clinical application of the genome has not yet become routine, however, in part because physicians are still learning how best to utilize such information. As an educational research exercise performed in conjunction with our medical school human anatomy course, we explored the potential utility of determining the whole genome sequence of a patient who had died following a clinical diagnosis of idiopathic pulmonary fibrosis (IPF). Medical students performed dissection and whole genome sequencing of the cadaver. Gross and microscopic findings were more consistent with the fibrosing variant of nonspecific interstitial pneumonia (NSIP), as opposed to IPF per se. Variants in genes causing Mendelian disorders predisposing to IPF were not detected. However, whole genome sequencing identified several common variants associated with IPF, including a single nucleotide polymorphism (SNP), rs35705950, located in the promoter region of the gene encoding mucin glycoprotein MUC5B. The MUC5B promoter polymorphism was recently found to markedly elevate risk for IPF, though a particular association with NSIP has not been previously reported, nor has its contribution to disease risk previously been evaluated in the genome-wide context of all genetic variants. We did not identify additional predicted functional variants in a region of linkage disequilibrium (LD) adjacent to MUC5B, nor did we discover other likely risk-contributing variants elsewhere in the genome. Whole genome sequencing thus corroborates the association of rs35705950 with MUC5B dysregulation and interstitial lung disease. This novel exercise additionally served a unique mission in bridging clinical and basic science education.

11.

Whole Genome Complete Resequencing of Bacillus subtilis Natto by Combining Long Reads with High-Quality Short Reads

Mayumi Kamada Sumitaka Hase Kengo Sato Atsushi Toyoda Asao Fujiyama Yasubumi Sakakibara 《PloS one》2014,9(10)

De novo microbial genome sequencing reached a turning point with third-generation sequencing (TGS) platforms, and several microbial genomes have been improved by TGS long reads. Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and it has a function in the production of the traditional Japanese fermented food “natto.” The B. subtilis natto BEST195 genome was previously sequenced with short reads, but it included some incomplete regions. We resequenced the BEST195 genome using a PacBio RS sequencer, and we successfully obtained a complete genome sequence from one scaffold without any gaps, and we also applied Illumina MiSeq short reads to enhance quality. Compared with the previous BEST195 draft genome and Marburg 168 genome, we found that incomplete regions in the previous genome sequence were attributed to GC-bias and repetitive sequences, and we also identified some novel genes that are found only in the new genome.

12.

Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study

Cerdeira LT Carneiro AR Ramos RT de Almeida SS D'Afonseca V Schneider MP Baumbach J Tauch A McCulloch JA Azevedo VA Silva A 《Journal of microbiological methods》2011,86(2):218-223

Due to the advent of the so-called Next-Generation Sequencing (NGS) technologies, the amount of monetary and temporal resources for whole-genome sequencing has been reduced by several orders of magnitude. Sequence reads can be assembled either by anchoring them directly onto an available reference genome (classical reference assembly), or can be concatenated by overlap (de novo assembly). The latter strategy is preferable because it tends to maintain the architecture of the genome sequence; however, depending on the NGS platform used, the shortness of read lengths causes tremendous problems in the subsequent genome assembly phase, impeding closing of the entire genome sequence. To address the problem, we developed a multi-pronged hybrid de novo strategy combining De Bruijn graph and Overlap-Layout-Consensus methods, which was used to assemble from short reads the entire genome of Corynebacterium pseudotuberculosis strain I19, a bacterium with immense importance in veterinary medicine that causes Caseous Lymphadenitis in ruminants, principally ovines and caprines. Briefly, contigs were assembled de novo from the short reads and were only oriented using a reference genome by anchoring. Remaining gaps were closed using iterative anchoring of short reads by craning to gap flanks. Finally, we compare the genome sequence assembled using our hybrid strategy to a classical reference assembly using the same data as input and show that with the availability of a reference genome, it pays off to use the hybrid de novo strategy, rather than a classical reference assembly, because more genome sequences are preserved using the former.
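
The De Bruijn graph idea named in the abstract above can be illustrated with a toy sketch. This is not the authors' pipeline: the function name `de_bruijn_graph`, the choice of k, and the two example reads are all hypothetical, and real assemblers additionally handle reverse complements, sequencing errors, and coverage.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a toy De Bruijn graph: nodes are (k-1)-mers, edges are k-mers.

    Hypothetical helper for illustration only; it ignores reverse
    complements, sequencing errors, and k-mer multiplicity.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of width k over the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer links its prefix (k-1)-mer to its suffix (k-1)-mer.
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Two made-up overlapping reads (assumption, not data from the study).
toy_reads = ["ATGGC", "GGCGT"]
toy_graph = de_bruijn_graph(toy_reads, k=3)
for node in sorted(toy_graph):
    print(node, "->", sorted(toy_graph[node]))
```

Walking the edges of this small graph from `AT` spells out `ATGGCGT`, which is how overlapping short reads can be merged into a longer contig without a reference.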

13.

Revising a Personal Genome by Comparing and Combining Data from Two Different Sequencing Platforms

Deokhoon Kim Woo-Yeon Kim Sun-Young Lee Sung-Yeoun Lee Hongseok Yun Soo-Yong Shin Jungyoun Lee Yoojin Hong Youngmi Won Seong-Jin Kim Yong Seok Lee Sung-Min Ahn 《PloS one》2013,8(4)

For the robust practice of genomic medicine, sequencing results must be compatible, regardless of the sequencing technologies and algorithms used. Presently, genome sequencing is still an imprecise science and is complicated by differences in the chemistry, coverage, alignment, and variant-calling algorithms. We identified ∼3.33 million single nucleotide variants (SNVs) and ∼3.62 million SNVs in the SJK genome using SOLiD and Illumina data, respectively. Approximately 3 million SNVs were concordant between the two platforms while 68,532 SNVs were discordant; 219,616 SNVs were SOLiD-specific and 516,080 SNVs were Illumina-specific (i.e., platform-specific). Concordant, discordant, and platform-specific SNVs were further analyzed and characterized. Overall, a large portion of heterozygous SNVs that were discordant with genotyping calls of single nucleotide polymorphism chips were highly confident. Approximately 70% of the platform-specific SNVs were located in regions containing repetitive sequences. Such platform-specificity may arise from differences between platforms, with regard to read length (36 bp and 72 bp vs. 50 bp), insert size (∼100–300 bp vs. ∼1–2 kb), sequencing chemistry (sequencing-by-synthesis using single nucleotides vs. ligation-based sequencing using oligomers), and sequencing quality. When data from the two platforms were merged for variant calling, the proportion of callable regions of the reference genome increased to 99.66%, which was 1.43% higher than the average callability of the two platforms, representing ∼40 million bases. In this study, we compared the differences in sequencing results between two sequencing platforms. Approximately 90% of the SNVs were concordant between the two platforms, yet ∼10% of the SNVs were either discordant or platform-specific, indicating that each platform had its own strengths and weaknesses. 
When data from the two platforms were merged, both the overall callability of the reference genome and the overall accuracy of the SNVs improved, demonstrating the likelihood that a re-sequenced genome can be revised using complementary data.
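
The concordant, discordant, and platform-specific categories described in the abstract above can be sketched as set operations over two call sets. This is a minimal illustration under stated assumptions: the function name `compare_calls`, the dictionary layout, and the toy genotypes are hypothetical, and "discordant" is taken here to mean a site called by both platforms with different genotypes, which the abstract does not spell out.

```python
def compare_calls(calls_a, calls_b):
    """Classify SNV calls from two platforms.

    Each argument maps a site, given as (chromosome, position), to a
    genotype string. Hypothetical layout chosen for illustration.
    """
    shared = calls_a.keys() & calls_b.keys()
    # Same site and same genotype on both platforms.
    concordant = {site for site in shared if calls_a[site] == calls_b[site]}
    # Same site, different genotype (assumed definition of "discordant").
    discordant = shared - concordant
    return {
        "concordant": concordant,
        "discordant": discordant,
        "a_specific": calls_a.keys() - calls_b.keys(),
        "b_specific": calls_b.keys() - calls_a.keys(),
    }


# Made-up calls for two platforms (not data from the study).
platform_a = {("chr1", 100): "A/G", ("chr1", 250): "C/C", ("chr2", 40): "T/T"}
platform_b = {("chr1", 100): "A/G", ("chr1", 250): "C/T", ("chr3", 7): "G/G"}

result = compare_calls(platform_a, platform_b)
for category, sites in result.items():
    print(category, len(sites))
```

Keeping the categories as sets of sites, rather than bare counts, makes it straightforward to intersect them later with annotations such as repeat regions.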

14.

Whole genome sequences of Treponema pallidum subsp. endemicum isolated from Cuban patients: The non-clonal character of isolates suggests a persistent human infection rather than a single outbreak

Eliška Vrbová Angel A. Noda Linda Grillová Islay Rodríguez Allyn Forsyth Jan Oppelt David Šmajs 《PLoS neglected tropical diseases》2022,16(6)

Bejel (endemic syphilis) is a neglected non-venereal disease caused by Treponema pallidum subsp. endemicum (TEN). Although it is mostly present in hot, dry climates, a few cases have been found outside of these areas. The aim of this work was the sequencing and analysis of TEN isolates obtained from “syphilis patients” in Cuba, which is not considered an endemic area for bejel. Genomes were obtained by pool segment genome sequencing or direct sequencing methods, and the bioinformatics analysis was performed according to an established pipeline. We obtained four genomes with 100%, 81.7%, 52.6%, and 21.1% breadth of coverage, respectively. The sequenced genomes revealed a non-clonal character, with nucleotide variability ranging between 0.2–10.3 nucleotide substitutions per 100 kbp among the TEN isolates. Nucleotide changes affected 27 genes, and the analysis of the completely sequenced genome also showed a recombination event between tprC and tprI, in TP0488 as well as in the intergenic region between TP0127–TP0129. Despite limitations in the quality of samples affecting breadth of sequencing coverage, the determined non-clonal character of the isolates suggests a persistent infection in the Cuban population rather than a single outbreak caused by an imported case.
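
Breadth of coverage, the metric reported for the four genomes above, is commonly computed as the percentage of reference positions covered by at least some minimum read depth. The sketch below assumes that definition; the function name `breadth_of_coverage`, the `min_depth` threshold, and the toy depth vector are hypothetical and are not taken from the study's pipeline.

```python
def breadth_of_coverage(depths, min_depth=1):
    """Percentage of reference positions with depth >= min_depth.

    `depths` holds one per-base read depth per reference position.
    The threshold of 1 is an assumption for illustration.
    """
    if not depths:
        raise ValueError("depths must contain at least one position")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return 100.0 * covered / len(depths)


# Made-up per-base depths for a 10-position reference.
toy_depths = [0, 3, 5, 0, 1, 2, 0, 0, 4, 1]
print(breadth_of_coverage(toy_depths))               # 6 of 10 positions covered
print(breadth_of_coverage(toy_depths, min_depth=3))  # 3 of 10 positions covered
```

Raising `min_depth` gives a stricter figure, which matters when low-quality samples yield many positions covered by only one or two reads.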

15.

Chromosome-level genome assembly of a xerophytic plant, Haloxylon ammodendron

Mingcheng Wang Lei Zhang Shaofei Tong Dechun Jiang Zhixi Fu 《DNA research》2022,29(2)

Haloxylon ammodendron is a xerophytic perennial shrub or small tree that has a high ecological value in anti-desertification due to its high tolerance to drought and salt stress. Here, we report a high-quality, chromosome-level genome assembly of H. ammodendron by integrating PacBio’s high-fidelity sequencing and Hi-C technology. The assembled genome size was 685.4 Mb, of which 99.6% was assigned to nine pseudochromosomes with a contig N50 value of 23.6 Mb. Evolutionary analysis showed that both the recent substantial amplification of long terminal repeat retrotransposons and tandem gene duplication may have contributed to its genome size expansion and arid adaptation. An ample amount of low-GC genes was closely related to functions that may contribute to the desert adaptation of H. ammodendron. Gene family clustering together with gene expression analysis identified differentially expressed genes that may play important roles in the direct response of H. ammodendron to water-deficit stress. We also identified several genes possibly related to the degraded scaly leaves and well-developed root system of H. ammodendron. The reference-level genome assembly presented here will provide a valuable genomic resource for studying the genome evolution of xerophytic plants, as well as for further genetic breeding studies of H. ammodendron.
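
The contig N50 quoted above is a standard contiguity statistic: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the assembly size. A minimal sketch follows; the function name `n50` and the toy lengths are hypothetical and are not taken from the H. ammodendron assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length


# Made-up contig lengths; total is 290, so half is 145.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))  # 80 + 70 = 150 >= 145, so the N50 is 70
```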

16.

Linkage mapping and comparative genomics using next-generation RAD sequencing of a non-model organism

Baxter SW Davey JW Johnston JS Shelton AM Heckel DG Jiggins CD Blaxter ML 《PloS one》2011,6(4):e19315

Restriction-site associated DNA (RAD) sequencing is a powerful new method for targeted sequencing across the genomes of many individuals. This approach has broad potential for genetic analysis of non-model organisms including genotype-phenotype association mapping, phylogeography, population genetics and scaffolding genome assemblies through linkage mapping. We constructed a RAD library using genomic DNA from a Plutella xylostella (diamondback moth) backcross that segregated for resistance to the insecticide spinosad. Sequencing of 24 individuals was performed on a single Illumina GAIIx lane (51 base paired-end reads). Taking advantage of the lack of crossing over in homologous chromosomes in female Lepidoptera, 3,177 maternally inherited RAD alleles were assigned to the 31 chromosomes, enabling identification of the spinosad resistance and W/Z sex chromosomes. Paired-end reads for each RAD allele were assembled into contigs and compared to the genome of Bombyx mori (n = 28) using BLAST, revealing 28 homologous matches plus 3 expected fusion/breakage events which account for the difference in chromosome number. A genome-wide linkage map (1292 cM) was inferred with 2,878 segregating RAD alleles inherited from the backcross father, producing chromosome and location specific sequenced RAD markers. Here we have used RAD sequencing to construct a genetic linkage map de novo for an organism that has no previous genome data. Comparative analysis of P. xylostella linkage groups with B. mori chromosomes shows, for the first time, that genetic synteny appears common beyond the Macrolepidoptera. RAD sequencing is a powerful system capable of rapidly generating chromosome specific data for non-model organisms.

17.

Two Low Coverage Bird Genomes and a Comparison of Reference-Guided versus De Novo Genome Assemblies

Daren C. Card Drew R. Schield Jacobo Reyes-Velasco Matthew K. Fujita Audra L. Andrew Sara J. Oyler-McCance Jennifer A. Fike Diana F. Tomback Robert P. Ruggiero Todd A. Castoe 《PloS one》2014,9(9)

As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (∼3.5–5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies by comparing CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure among assemblies. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies.

18.

Sequencing of the Dutch Elm Disease Fungus Genome Using the Roche/454 GS-FLX Titanium System in a Comparison of Multiple Genomics Core Facilities

Vincenzo Forgetta Gary Leveque Joana Dias Deborah Grove Robert Lyons Jr. Suzanne Genik Chris Wright Sushmita Singh Nichole Peterson Michael Zianni Jan Kieleczawa Robert Steen Anoja Perera Doug Bintzler Scottie Adams Will Hintz Volker Jacobi Louis Bernier Roger Levesque Ken Dewar 《Journal of biomolecular techniques》2013,24(1):39-49

As part of the DNA Sequencing Research Group of the Association of Biomolecular Resource Facilities, we have tested the reproducibility of the Roche/454 GS-FLX Titanium System at five core facilities. Experience with the Roche/454 system ranged from <10 to >340 sequencing runs performed. All participating sites were supplied with an aliquot of a common DNA preparation and were requested to conduct sequencing at a common loading condition. The evaluation of sequencing yield and accuracy metrics was assessed at a single site. The study was conducted using a laboratory strain of the Dutch elm disease fungus Ophiostoma novo-ulmi strain H327, an ascomycete, vegetatively haploid fungus with an estimated genome size of 30–50 Mb. We show that the Titanium System is reproducible, with some variation detected in loading conditions, sequencing yield, and homopolymer length accuracy. We demonstrate that reads shorter than the theoretical minimum length are of lower overall quality and not simply truncated reads. The O. novo-ulmi H327 genome assembly is 31.8 Mb and is comprised of eight chromosome-length linear scaffolds, a circular mitochondrial contig of 66.4 kb, and a putative 4.2-kb linear plasmid. We estimate that the nuclear genome encodes 8613 protein coding genes, and the mitochondrion encodes 15 genes and 26 tRNAs.

19.

Evaluation of Genome Sequencing Quality in Selected Plant Species Using Expressed Sequence Tags

Lingfei Shangguan Jian Han Emrul Kayesh Xin Sun Changqing Zhang Tariq Pervaiz Xicheng Wen Jinggui Fang 《PloS one》2013,8(7)

Background

With the completion of genome sequencing projects for more than 30 plant species, large volumes of genome sequences have been produced and stored in online databases. Advancements in sequencing technologies have reduced the cost and time of whole genome sequencing enabling more and more plants to be subjected to genome sequencing. Despite this, genome sequence qualities of multiple plants have not been evaluated.

Methodology/Principal Finding

Integrity and accuracy were calculated to evaluate the genome sequence quality of 32 plants. The integrity of a genome sequence is represented by the ratio between chromosome size and genome size (or between scaffold size and genome size), which ranged from 55.31% to nearly 100%. The accuracy of a genome sequence is represented by the ratio between matched ESTs and selected ESTs, where 52.93% ∼ 98.28% and 89.02% ∼ 98.85% of the randomly selected clean ESTs could be mapped to chromosome and scaffold sequences, respectively. According to the integrity, accuracy and other analysis of each plant species, thirteen plant species were divided into four levels. Arabidopsis thaliana, Oryza sativa and Zea mays had the highest quality, followed by Brachypodium distachyon, Populus trichocarpa, Vitis vinifera and Glycine max, Sorghum bicolor, Solanum lycopersicum and Fragaria vesca, and Lotus japonicus, Medicago truncatula and Malus × domestica in that order. Assembling the scaffold sequences into chromosome sequences should be the primary task for the remaining nineteen species. Low GC content and repeat DNA influence genome sequence assembly.

Conclusion

The quality of plant genome sequences was found to be lower than envisaged and thus the rapid development of genome sequencing projects as well as research on bioinformatics tools and the algorithms of genome sequence assembly should provide increased processing and correction of genome sequences that have already been published.
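
The two quality ratios defined in the Methodology paragraph of this abstract (integrity as assembled size over genome size, accuracy as matched ESTs over selected ESTs) amount to simple percentages. The sketch below only restates that arithmetic; the function name `percent_ratio` and every number in it are hypothetical and are not values from the study.

```python
def percent_ratio(numerator, denominator):
    """Return numerator / denominator as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


# Made-up values for one imaginary plant genome (not from the study).
assembled_chromosome_bp = 420_000_000  # total length anchored to chromosomes
estimated_genome_bp = 500_000_000      # estimated genome size
matched_ests = 9_500                   # ESTs that map to the assembly
selected_ests = 10_000                 # randomly selected clean ESTs

integrity = percent_ratio(assembled_chromosome_bp, estimated_genome_bp)
accuracy = percent_ratio(matched_ests, selected_ests)
print(f"integrity = {integrity:.2f}%, accuracy = {accuracy:.2f}%")
```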

20.

Enhanced De Novo Assembly of High Throughput Pyrosequencing Data Using Whole Genome Mapping

Fatma Onmus-Leone Jun Hang Robert J. Clifford Yu Yang Matthew C. Riley Robert A. Kuschner Paige E. Waterman Emil P. Lesho 《PloS one》2013,8(4)

Despite major advances in next-generation sequencing, assembly of sequencing data, especially data from novel microorganisms or re-emerging pathogens, remains constrained by the lack of suitable reference sequences. De novo assembly is the best approach to achieve an accurate finished sequence, but multiple sequencing platforms or paired-end libraries are often required to achieve full genome coverage. In this study, we demonstrated a method to assemble complete bacterial genome sequences by integrating shotgun Roche 454 pyrosequencing with optical whole genome mapping (WGM). The whole genome restriction map (WGRM) was used as the reference to scaffold de novo assembled sequence contigs through a stepwise process. Large de novo contigs were placed in the correct order and orientation through alignment to the WGRM. De novo contigs that were not aligned to the WGRM were merged into scaffolds using contig branching structure information. These extended scaffolds were then aligned to the WGRM to identify the overlaps to be eliminated and the gaps and mismatches to be resolved with unused contigs. The process was repeated until a sequence with full coverage and alignment with the whole genome map was achieved. Using this method we were able to achieve 100% WGRM coverage without a paired-end library. We assembled complete sequences for three distinct genetic components of a clinical isolate of Providencia stuartii: a bacterial chromosome, a novel blaNDM-1 plasmid, and a novel bacteriophage, without separately purifying them to homogeneity.