Similar literature
 20 similar articles found (search time: 15 ms)
1.
SARS-CoV Genome Polymorphism: A Bioinformatics Study   (Total citations: 3; self-citations: 0; citations by others: 3)
A dataset of 103 SARS-CoV isolates (101 from human patients and 2 from palm civets) was investigated for different aspects of genome polymorphism and isolate classification. The number and distribution of single nucleotide variations (SNVs) and of insertions and deletions were determined and discussed with respect to a “profile” (a sequence containing the most represented letter at each position). The distribution of substitution categories per codon position, as well as of synonymous and non-synonymous substitutions in coding regions of annotated isolates, was determined, along with amino acid (a.a.) property changes. A similar analysis was performed for the spike (S) protein in all the isolates (55 of them being predicted for the first time). The Ka/Ks ratio confirmed that the S gene was subject to Darwinian selection during virus transmission from animals to humans. Isolates from the dataset were classified according to genome polymorphism and genotypes. Genome polymorphism yields two groups, one with a small number of SNVs and another with a large number of SNVs, with up to four subgroups with respect to insertions and deletions. We identified three basic nine-locus genotypes: TTTT/TTCGG, CGCC/TTCAT, and TGCC/TTCGT, with four subgenotypes. Both proposed classifications are in accordance with the new insights into possible epidemiological spread, both in space and time.
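The “profile” described in this abstract (the most represented letter at each position) can be illustrated with a minimal sketch. The function names `build_profile` and `snv_positions`, the toy sequences, and the choice to skip gap characters are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def build_profile(aligned):
    """Return the most frequent letter at each column of an alignment."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    profile = []
    for column in zip(*aligned):
        # most_common(1) yields [(letter, count)]; ties go to the first-seen letter
        profile.append(Counter(column).most_common(1)[0][0])
    return "".join(profile)


def snv_positions(seq, profile):
    """0-based positions where seq differs from the profile by a substitution.

    Columns containing a gap ('-') are skipped, since those would be
    insertions or deletions rather than single nucleotide variations.
    """
    return [
        i
        for i, (a, b) in enumerate(zip(seq, profile))
        if a != b and "-" not in (a, b)
    ]


# Toy alignment (hypothetical data, not SARS-CoV sequences)
toy = ["ACGTAC", "ACGTTC", "ACCTAC", "ACGTAC"]
profile = build_profile(toy)
snvs = {seq: snv_positions(seq, profile) for seq in toy}
```

Counting the entries of `snvs` per isolate gives the kind of per-isolate SNV tally that the abstract uses to separate isolates with few SNVs from those with many.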

2.

Background

The very recent availability of fully sequenced individual human genomes is a major revolution in biology which is certainly going to provide new insights into genetic diseases and genomic rearrangements.

Results

We mapped the insertions, deletions and SNPs (single nucleotide polymorphisms) that are present in Craig Venter's genome, more precisely on chromosomes 17 to 22, and compared them with the human reference genome hg17. Our results show that insertions and deletions are almost absent in L1 and generally scarce in L2 isochore families (GC-poor L1+L2 isochores represent slightly over half of the human genome), whereas they increase in GC-rich isochores, largely paralleling the densities of genes, retroviral integrations and Alu sequences. The distributions of insertions/deletions are in striking contrast with those of SNPs which exhibit almost the same density across all isochore families with, however, a trend for lower concentrations in gene-rich regions.

Conclusions

Our study strongly suggests that the distribution of insertions/deletions is due to the structure of chromatin which is mostly open in gene-rich, GC-rich isochores, and largely closed in gene-poor, GC-poor isochores. The different distributions of insertions/deletions and SNPs are clearly related to the two different responsible mechanisms, namely recombination and point mutations.

3.

Background  

Insertions and deletions of DNA segments (indels) are, together with substitutions, the major mutational processes that generate genetic variation. Here we focus on recent DNA insertions and deletions in protein coding regions of the human genome to investigate selective constraints on indels in protein evolution.

4.

Background

Improved tuberculosis control and the need to contain the spread of drug-resistant strains provide a strong rationale for exploring tuberculosis transmission dynamics at the population level. Whole-genome sequencing provides optimal strain resolution, facilitating detailed mapping of potential transmission pathways.

Methods

We sequenced 22 isolates from a Mycobacterium tuberculosis cluster in New South Wales, Australia, identified during routine 24-locus mycobacterial interspersed repetitive unit typing. Following high-depth paired-end sequencing using the Illumina HiSeq 2000 platform, two independent pipelines were employed for analysis, both employing read mapping onto reference genomes as well as de novo assembly, to control biases in variant detection. In addition to single-nucleotide polymorphisms, the analyses also sought to identify insertions, deletions and structural variants.

Results

Isolates were highly similar, with a distance of 13 variants between the most distant members of the cluster. The most sensitive analysis classified the 22 isolates into 18 groups. Four of the isolates did not appear to share a recent common ancestor with the largest clade; another four isolates had an uncertain ancestral relationship with the largest clade.

Conclusion

Whole genome sequencing, with analysis of single-nucleotide polymorphisms, insertions, deletions, structural variants and subpopulations, enabled the highest possible level of discrimination between cluster members, clarifying likely transmission pathways and exposing the complexity of strain origin. The analysis provides a basis for targeted public health intervention and enhanced classification of future isolates linked to the cluster.
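The "distance of 13 variants between the most distant members" reported in this abstract is a pairwise distance over variant calls. A minimal sketch of one way such a distance could be computed is given here, assuming each isolate is represented as a set of variant labels relative to a common reference. The isolate names, variant labels and function names are hypothetical, and the authors' actual pipelines are not described in enough detail to reproduce.

```python
from itertools import combinations


def variant_distance(a, b):
    """Number of variants present in exactly one of the two isolates."""
    return len(a ^ b)  # symmetric difference of the two call sets


def max_pairwise_distance(calls):
    """Return (distance, isolate_x, isolate_y) for the most distant pair."""
    return max(
        (variant_distance(calls[x], calls[y]), x, y)
        for x, y in combinations(sorted(calls), 2)
    )


# Hypothetical variant calls relative to a shared reference genome
calls = {
    "isoA": {"1021C>T", "5530G>A"},
    "isoB": {"1021C>T", "5530G>A", "9120T>C"},
    "isoC": {"1021C>T", "7744delG"},
}
most_distant = max_pairwise_distance(calls)
```

Isolates whose call sets are identical have distance 0, so grouping isolates by identical sets is one simple way to arrive at groups like those mentioned in the Results.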

5.
6.

Background

Identifying pathogen virulence genes required to cause disease is crucial to understand the mechanisms underlying the pathogenic process. Plasmid insertion mutagenesis of fungal protoplasts is frequently used for this purpose in filamentous ascomycetes. Post transformation, the mutant population is screened for loss of virulence to a specific plant or animal host. Identifying the insertion event has previously met with varying degrees of success, from a cleanly disrupted gene with minimal deletion of nucleotides at the insertion point to multiple-copy insertion events and large deletions of chromosomal regions. Currently, extensive mutant collections exist in laboratories globally where it was hitherto impossible to identify all the affected genes.

Results

We used a whole-genome sequencing (WGS) approach using Illumina HiSeq 2000 technology to investigate DNA tag insertion points and chromosomal deletion events in mutagenised, reduced virulence F. graminearum isolates identified in disease tests on wheat (Triticum aestivum). We developed the FindInsertSeq workflow to localise the DNA tag insertions to the nucleotide level. The workflow was tested using four mutants showing evidence of single and multi-copy insertions in DNA blot analysis. FindInsertSeq was able to identify both single and multi-copy concatenation insertion sites. By comparing sequencing coverage, unexpected molecular recombination events such as large tagged and untagged chromosomal deletions, and DNA amplification were observed in three of the analysed mutants. A random data sampling approach revealed the minimum genome coverage required to survey the F. graminearum genome for alterations.

Conclusions

This study demonstrates that whole-genome re-sequencing to 22-fold genome coverage is an efficient tool to characterise single and multi-copy insertion mutants in the filamentous ascomycete Fusarium graminearum. In some cases insertion events are accompanied by large untagged chromosomal deletions, while in other cases a straightforward insertion event could be confirmed. The FindInsertSeq analysis workflow presented in this study enables researchers to efficiently characterise insertion and deletion mutants.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1412-9) contains supplementary material, which is available to authorized users.

7.

Background

Haloquadratum walsbyi commonly dominates the microbial flora of hypersaline waters. Its cells are extremely fragile squares requiring >14% (w/v) salt for growth, properties that should limit its dispersal and promote geographical isolation and divergence. To assess this, the genome sequences of two isolates recovered from sites at near maximum distance on Earth were compared.

Principal Findings

Both chromosomes are 3.1 Mb in size, and 84% of each sequence was highly similar to the other (98.6% identity), comprising the core sequence. ORFs of this shared sequence were completely syntenic (conserved in genomic orientation and order), without inversion or rearrangement. Strain-specific insertions/deletions could be precisely mapped, often allowing the genetic events to be inferred. Many inferred deletions were associated with short direct repeats (4–20 bp). Deletion-coupled insertions are frequent, producing different sequences at identical positions. In cases where the inserted and deleted sequences are homologous, this leads to variant genes in a common syntenic background (as already described by others). Cas/CRISPR systems are present in C23T but have been lost in HBSQ001 except for a few spacer remnants. Numerous types of mobile genetic elements occur in both strains, most of which appear to be active, and with some specifically targeting others. Strain C23T carries two ∼6 kb plasmids that show similarity to halovirus His1 and to sequences near halovirus/plasmid gene clusters commonly found in haloarchaea.

Conclusions

Deletion-coupled insertions show that Hqr. walsbyi evolves by uptake and precise integration of foreign DNA, probably originating from close relatives. Change is also driven by mobile genetic elements but these do not by themselves explain the atypically low gene coding density found in this species. The remarkable genome conservation despite the presence of active systems for genome rearrangement implies both an efficient global dispersal system, and a high selective fitness for this species.

8.
In this work, the genome of severe acute respiratory syndrome associated coronavirus (SARS-CoV) isolate BJ202 (AY864806) was completely sequenced. The genome was obtained directly from the stool sample of a patient in Beijing. Comparative genomics methods were used to analyze the sequence variations of 116 SARS-CoV genomes (including BJ202) available in NCBI GenBank. With the genome sequence of GZ02 as the reference, 41 polymorphic sites were identified in BJ202, and a total of 278 polymorphic sites were present in at least two of the 116 genomes. The polymorphic sites were distributed unevenly over the genome. About half of the variations (50.4%, 140/278) clustered in the one third of the genome at the 3′ end (19.0 kb-29.7 kb). Regions encoding Orf10-11, Orf3/4, and the E, M and S proteins had the highest mutation rates. A total of 15 PCR products (about 6.0 kb of the genome), including 11 fragments containing 12 known polymorphic sites and 4 fragments without identified polymorphic sites, were cloned and sequenced. Results showed that 3 polymorphic sites unique to BJ202 (positions 13 804, 15 031 and 20 792), along with 3 other polymorphic sites (26 428, 26 477 and 27 243), all contained 2 kinds of nucleotides. Interestingly, position 18 379, which has not been identified as polymorphic in any of the other 115 published SARS-CoV genomes, is actually a polymorphic site: among the sequenced clones, 8 carried A and 6 carried G at this site. Among the 116 SARS-CoV genomes, 18 types of deletions and 2 insertions were identified. Most of them were related to a 300 bp region (27 700-28 000) which encodes parts of the putative ORF9 and ORF10-11. A phylogenetic tree illustrating the divergence of the whole BJ202 genome from 115 other completely sequenced SARS-CoVs was also constructed. BJ202 was phylogenetically closer to BJ01 and LLJ-2004.

9.

Background  

Determining the position and order of contigs and scaffolds from a genome assembly within an organism's genome remains a technical challenge in a majority of sequencing projects. In order to exploit contemporary technologies for DNA sequencing, we developed a strategy for whole genome single nucleotide polymorphism sequencing allowing the positioning of sequence contigs onto a linkage map using the bin mapping method.

10.
Nucleotide insertions and deletions (indels) are responsible for gaps in sequence alignments. Indels are one of the major sources of evolutionary change at the molecular level. We examined the patterns of insertions and deletions in 19 mammalian genomes and found that deletion events are more common than insertions. The numbers of both insertions and deletions decrease rapidly as gap length increases, and single-nucleotide indels are the most frequent of all indel events. The frequencies of both insertions and deletions are well described by a power law. Key words: insertion, deletion, gap, indel, mammalian genome.
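The statement that indel frequencies follow a power law means that the count of gaps of length L behaves roughly like C · L^(−α). A minimal sketch of estimating C and α by a least-squares line on log-log axes follows. The function name `fit_power_law` and the synthetic counts are assumptions for illustration only; the abstract does not state which fitting procedure the authors used, and maximum-likelihood estimators are often preferred over log-log regression for real data.

```python
import math


def fit_power_law(lengths, counts):
    """Fit count = C * length**(-alpha) by least squares in log-log space.

    Returns (C, alpha).
    """
    xs = [math.log(length) for length in lengths]
    ys = [math.log(count) for count in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log(count) = log(C) - alpha * log(length), so alpha is the negated slope
    return math.exp(intercept), -slope


# Synthetic gap-length counts generated from an exact power law
# (illustrative values, not data from the study)
lengths = list(range(1, 11))
counts = [1000.0 * length ** -1.8 for length in lengths]
C, alpha = fit_power_law(lengths, counts)
```

With a positive α the fitted curve is highest at length 1 and drops quickly, which matches the abstract's observation that single-nucleotide indels are the most frequent.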

11.

Background  

Amino acid insertions and deletions in proteins are considered relatively rare events, and their associations with the evolution and adaptation of organisms are not yet understood. In this study, we undertook a systematic analysis of over 214,000 polypeptides from 32 nematode species and identified insertions and deletions unique to nematode proteins in more than 1000 families and provided indirect evidence that these alterations are linked to the evolution and adaptation of nematodes.

12.
13.

Background

Salmonella Typhimurium is frequently isolated from foodborne infection cases in Hong Kong, but the lack of genome sequences has hindered in-depth epidemiological and phylogenetic studies. In this study, we sought to reconstruct the phylogenetic relationship and investigate the distribution and mutation patterns of virulence determinants among local S. Typhimurium clinical isolates using their genome sequences.

Results

We obtained genome sequences of 20 S. Typhimurium clinical isolates from a local hospital cluster using a 454 GS FLX Titanium sequencing platform. Phylogenetic analysis was performed based on single nucleotide polymorphism positions of the core genome against the reference strain LT2. Antimicrobial susceptibility was determined using minimal inhibitory concentration for five antimicrobial agents and analyses of virulence determinants were performed through referencing to various databases. Through phylogenetic analysis, we revealed two distinct clades of S. Typhimurium isolates and three outliers in Hong Kong, which differ remarkably in antimicrobial susceptibility and presentation and mutations of virulence determinants. The local isolates were not closely related to many of the previously sequenced S. Typhimurium isolates, except LT2. As the isolates in the two clades spanned over 10 years of isolation, they probably represent endemic strains. The outliers are possibly introduced from outside of Hong Kong. The close relatedness of members in one of the clades to LT2 and the Japanese stool isolate T000240 suggests the potential reemergence of LT2 progeny in regions nearby.

Conclusions

Our study demonstrated the utility of next-generation sequencing coupled to traditional microbiological testing method in a retrospective epidemiological study involving multiple clinical isolates. The evolution of multidrug- and ciprofloxacin-resistant strains among the more virulent clade is also an increasing concern.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1900-y) contains supplementary material, which is available to authorized users.

14.

Background  

The complexity of the wheat genome has resulted from waves of retrotransposable element insertions. Gene deletions and disruptions generated by the fast replacement of repetitive elements in wheat have resulted in disruption of colinearity at a micro (sub-megabase) level among the cereals. In view of genomic changes that are possible within a given time span, conservation of genes between species tends to imply an important functional or regional constraint that does not permit a change in genomic structure. The ctg1034 contig completed in this paper was initially studied because it was assigned to the Sr2 resistance locus region, but detailed mapping studies subsequently assigned it to the long arm of 3B and revealed its unusual features.

15.
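The complete genome sequence of SARS-CoV BJ202 (AY864806) was obtained by directly sequencing a stool sample from a SARS patient. Comparative genomics methods were used to analyze BJ202 together with the 115 SARS-CoV genome sequences published in GenBank. With the GZ02 sequence as the reference, a total of 278 single nucleotide polymorphism (SNP) sites were found to be present in two or more genomes. The polymorphic sites were distributed unevenly across the SARS-CoV genome, with about half of the variant sites (50.4%, 140/278) falling in the one third of the genome at the 3′ end. The regions encoding Orf10-11, Orf3/4, the E protein, the M protein and the S protein had relatively high mutation rates. Eleven cDNA fragments containing 12 polymorphic sites of the BJ202 genome and four cDNA fragments without known polymorphic sites were cloned and sequenced (the 15 fragments totalled 6.0 kb). The results showed that three polymorphic sites unique to BJ202 (13804, 15031 and 20792) and three other polymorphic sites (26428, 26477 and 27243) each yielded two different nucleotides; position 18379, although no mutation had been found there in the 115 published SARS-CoV genomes, is in fact also a polymorphic site. Of 14 clones, 8 carried A at this position and 6 carried G. Across all 116 SARS-CoV genomes there were 18 types of deletions and 2 types of insertions. Most deletions occurred in the region encoding ORF9 and ORF10-11 (positions 27700-28000 bp of the genome sequence). A phylogenetic tree of the 116 SARS-CoV isolates was constructed by the neighbor-joining method; BJ202 is relatively closely related to SARS-CoV isolates such as BJ01 and LLJ-2004.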

16.

Background  

Alternative selection of splice sites in tandem donors and acceptors is a major mode of alternative splicing. Here, we analyzed whether in-frame tandem sites leading to subtle mRNA insertions/deletions of 3, 6, or 9 nucleotides are under natural selection.

17.
We studied mitochondrial DNA variability in 19 natural Neurospora crassa isolates and one wild-type isolate to examine evolution of these fungi and their mitochondrial DNA (mtDNA). We combined restriction endonuclease analysis of natural isolate mtDNA with DNA-DNA hybridization to cloned EcoR I fragments of a wild-type genome to discriminate between length mutations and site changes due to nucleotide substitution. Most variability was due to length mutations (insertions and deletions); genome size could vary 25% between pairs of isolates. Length-mutation distribution was not random, nor simply explained by the presence of coding versus noncoding regions. Restriction-site changes were few; the estimated amount of nucleotide substitution per nucleotide between the most divergent pair of isolates was 0.78%. Evolutionary relationships among isolates based on both types of mutations were compatible, and suggest that geographically distinct populations of mitochondrial DNA exist in the biological species, N. crassa. In contrast, no such correlation was shown by the previously determined distribution of nuclear heterokaryon incompatibility genes in the same isolates (Mylyk, 1975, 1976).

18.

Background  

The Affymetrix MitoChip v2.0 is an oligonucleotide tiling array for the resequencing of the human mitochondrial (mt) genome. For each of 16,569 nucleotide positions of the mt genome it holds two sets of four 25-mer probes each that match the heavy and the light strand of a reference mt genome and vary only at their central position to interrogate all four possible alleles. In addition, the MitoChip v2.0 carries alternative local context probes to account for known mtDNA variants. These probes have been neglected in most studies due to the lack of software for their automated analysis.
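The probe layout described in this abstract (two sets of four 25-mers per position, differing only at the central base) can be sketched as follows. This is an illustration of the tiling idea only: the function names are hypothetical, positions too close to the sequence ends are simply rejected, and the actual probe design rules of the array are not given in the abstract.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_sets(reference, pos, probe_len=25):
    """Build the two four-probe sets interrogating 0-based position pos.

    Each forward probe keeps the reference flanks and places one of
    A, C, G, T at the centre; the second set is the reverse complement
    of the first, keyed by the forward-strand allele.
    """
    half = probe_len // 2
    if pos < half or pos + half >= len(reference):
        raise ValueError("position too close to the sequence end for a full probe")
    left = reference[pos - half:pos]
    right = reference[pos + 1:pos + half + 1]
    forward = {base: left + base + right for base in "ACGT"}
    reverse = {base: reverse_complement(probe) for base, probe in forward.items()}
    return forward, reverse


# Toy reference (hypothetical sequence, not the mitochondrial genome)
reference = "ACGT" * 10
forward, reverse = probe_sets(reference, 20)
```

Each position therefore yields eight probes, and the probe whose central base matches the sample is expected to give the strongest signal on each strand.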

19.

Background  

The quality of progressive sequence alignments strongly depends on the accuracy of the individual pairwise alignment steps since gaps that are introduced at one step cannot be removed at later aggregation steps. Adjacent insertions and deletions necessarily appear in arbitrary order in pairwise alignments and hence form an unavoidable source of errors.

20.

Introduction

The association between severity of illness of children with osteomyelitis caused by Methicillin-resistant Staphylococcus aureus (MRSA) and genomic variation of the causative organism has not been previously investigated. The purpose of this study is to assess genomic heterogeneity among MRSA isolates from children with osteomyelitis who have diverse severity of illness.

Materials and Methods

Children with osteomyelitis were prospectively studied between 2010 and 2011. Severity of illness of the affected children was determined from clinical and laboratory parameters. MRSA isolates were analyzed with next generation sequencing (NGS) and optical mapping. Sequence data was used for multi-locus sequence typing (MLST), phylogenetic analysis by maximum likelihood (PAML), and identification of virulence genes and single nucleotide polymorphisms (SNP) relative to reference strains.

Results

The twelve children studied demonstrated severity of illness scores ranging from 0 (mild) to 9 (severe). All isolates were USA300, ST 8, SCC mec IVa MRSA by MLST. The isolates differed from reference strains by 2 insertions (40 Kb each) and 2 deletions (10 and 25 Kb) but had no rearrangements or copy number variations. There was a higher occurrence of virulence genes among study isolates when compared to the reference strains (p = 0.0124). There were an average of 11 nonsynonymous SNPs per strain. PAML demonstrated heterogeneity of study isolates from each other and from the reference strains.

Discussion

Genomic heterogeneity exists among MRSA isolates causing osteomyelitis among children in a single community. These variations may play a role in the pathogenesis of variation in clinical severity among these children.
