
1.

SIFT Indel: Predictions for the Functional Effects of Amino Acid Insertions/Deletions in Proteins

Jing Hu Pauline C. Ng 《PloS one》2013,8(10)

Indels in the coding regions of a gene can either cause frameshifts or amino acid insertions/deletions. Frameshifting indels are indels that have a length that is not divisible by 3 and subsequently cause frameshifts. Indels that have a length divisible by 3 cause amino acid insertions/deletions or block substitutions; we call these 3n indels. The new amino acid changes resulting from 3n indels could potentially affect protein function. Therefore, we construct a SIFT Indel prediction algorithm for 3n indels which achieves 82% accuracy, 81% sensitivity, 82% specificity, 82% precision, 0.63 MCC, and 0.87 AUC by 10-fold cross-validation. We have previously published a prediction algorithm for frameshifting indels. The rules for the prediction of 3n indels are different from the rules for the prediction of frameshifting indels and reflect the biological differences of these two different types of variations. SIFT Indel was applied to human 3n indels from the 1000 Genomes Project and the Exome Sequencing Project. We found that common variants are less likely to be deleterious than rare variants. The SIFT indel prediction algorithm for 3n indels is available at http://sift-dna.org/
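The length rule in the abstract above (not divisible by 3 means frameshift, divisible by 3 means a 3n indel) can be sketched as follows. This is a minimal illustration, not the SIFT Indel algorithm itself: the function name and the REF/ALT allele representation are assumptions, and it does not attempt to distinguish block substitutions or predict functional effect.

```python
def classify_coding_indel(ref: str, alt: str) -> str:
    """Classify a coding-region indel by the net length change between alleles.

    `ref` and `alt` are hypothetical allele strings (e.g. VCF-style REF/ALT).
    """
    delta = len(alt) - len(ref)
    if delta == 0:
        # Equal-length alleles are not a net insertion or deletion.
        return "not_an_indel"
    if delta % 3 != 0:
        # Net length change not divisible by 3 shifts the reading frame.
        return "frameshift"
    # Net length change divisible by 3 preserves the reading frame.
    return "3n_insertion" if delta > 0 else "3n_deletion"
```

For example, `classify_coding_indel("A", "ACGT")` reports a 3n insertion because the alternate allele is three bases longer than the reference allele.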

2.

The genomic landscape of short insertion and deletion polymorphisms in the chicken (Gallus gallus) Genome: a high frequency of deletions in tandem duplicates (Total citations: 1; self-citations: 0; citations by others: 1)

Brandström M Ellegren H 《Genetics》2007,176(3):1691-1701

It is increasingly recognized that insertions and deletions (indels) are an important source of genetic as well as phenotypic divergence and diversity. We analyzed length polymorphisms identified through partial (0.25x) shotgun sequencing of three breeds of domestic chicken made by the International Chicken Polymorphism Map Consortium. A data set of 140,484 short indel polymorphisms in unique DNA was identified after filtering for microsatellite structures. There was a significant excess of tandem duplicates at indel sites, with deletions of a duplicate motif outnumbering the generation of duplicates through insertion. Indel density was lower in microchromosomes than in macrochromosomes, in the Z chromosome than in autosomes, and in 100 bp of upstream sequence, 5'-UTR, and first introns than in intergenic DNA and in other introns. Indel density was highly correlated with single nucleotide polymorphism (SNP) density. The mean density of indels in pairwise sequence comparisons was 1.9 × 10^-4 indel events/bp, approximately 5% of the density of SNPs segregating in the chicken genome. The great majority of indels involved a limited number of nucleotides (median 1 bp), with A-rich motifs being overrepresented at indel sites. The overrepresentation of deletions at tandem duplicates indicates that replication slippage in duplicate sequences is a common mechanism behind indel mutation. The correlation between indel and SNP density indicates common effects of mutation and/or selection on the occurrence of indels and point mutations.
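The notion of an indel at a tandem duplicate can be illustrated with a small check: does the inserted or deleted motif sit directly next to an identical copy? The sketch below is an assumption about how such a test might look, not the procedure used in the study; the function name and the three-string input layout are hypothetical.

```python
def is_tandem_duplicate(flank_left: str, motif: str, flank_right: str) -> bool:
    """Return True if the indel motif is immediately adjacent to an identical copy.

    `flank_left` and `flank_right` are the sequences directly upstream and
    downstream of the motif in the allele that carries it.
    """
    if not motif:
        return False
    # A copy ending right before the motif, or starting right after it,
    # makes the motif one unit of a tandem duplicate.
    return flank_left.endswith(motif) or flank_right.startswith(motif)
```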

3.

Mining for single nucleotide polymorphisms and insertions / deletions in expressed sequence tag libraries of oil palm

Riju A Chandrasekar A Arunachalam V 《Bioinformation》2007,2(4):128-131

4.

Pervasive indels and their evolutionary dynamics after the fish-specific genome duplication

Guo B Zou M Wagner A 《Molecular biology and evolution》2012,29(10):3005-3022

Insertions and deletions (indels) in protein-coding genes are important sources of genetic variation. Their role in creating new proteins may be especially important after gene duplication. However, little is known about how indels affect the divergence of duplicate genes. We here study thousands of duplicate genes in five fish (teleost) species with completely sequenced genomes. The ancestor of these species has been subject to a fish-specific genome duplication (FSGD) event that occurred approximately 350 Ma. We find that duplicate genes contain at least 25% more indels than single-copy genes. These indels accumulated preferentially in the first 40 my after the FSGD. A lack of widespread asymmetric indel accumulation indicates that both members of a duplicate gene pair typically experience relaxed selection. Strikingly, we observe a 30-80% excess of deletions over insertions that is consistent for indels of various lengths and across the five genomes. We also find that indels preferentially accumulate inside loop regions of protein secondary structure and in regions where amino acids are exposed to solvent. We show that duplicate genes with high indel density also show high DNA sequence divergence. Indel density, but not amino acid divergence, can explain a large proportion of the tertiary structure divergence between proteins encoded by duplicate genes. Our observations are consistent across all five fish species. Taken together, they suggest a general pattern of duplicate gene evolution in which indels are important driving forces of evolutionary change.

5.

Patterns of Insertion and Deletion in Mammalian Genomes

Yanhui Fan Wenjuan Wang Guoji Ma Lijing Liang Qi Shi Shiheng Tao 《Current Genomics》2007,8(6):370-378

Nucleotide insertions and deletions (indels) are responsible for gaps in sequence alignments. Indels are one of the major sources of evolutionary change at the molecular level. We have examined the patterns of insertions and deletions in 19 mammalian genomes, and found that deletion events are more common than insertions in the mammalian genomes. The numbers of both insertions and deletions decrease rapidly as the gap length increases, and single-nucleotide indels are the most frequent of all indel events. The frequencies of both insertions and deletions can be described well by a power law. Key Words: Insertion, deletion, gap, indel, mammalian genome.
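A power-law description of indel frequencies means that the count of indels of length L behaves roughly as c · L^(-alpha). One simple way to estimate alpha and c is ordinary least squares on the log-log scale. The sketch below assumes that approach for illustration only; the abstract does not say how the authors fitted their data, and the function name is hypothetical.

```python
import math

def fit_power_law(lengths, counts):
    """Fit counts ~ c * length**(-alpha) by least squares on log-log values.

    Returns (alpha, c). All lengths and counts must be positive.
    """
    xs = [math.log(length) for length in lengths]
    ys = [math.log(count) for count in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of the log-log line is -alpha
    intercept = mean_y - slope * mean_x    # intercept is log(c)
    return -slope, math.exp(intercept)
```

Log-log regression is easy to follow but statistically crude for count data; it is shown here only to make the power-law relationship concrete.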

6.

Fast and sensitive detection of indels induced by precise gene targeting

Zhang Yang Catharina Steentoft Camilla Hauge Lars Hansen Allan Lind Thomsen Francesco Niola Malene B. Vester-Christensen Morten Frödin Henrik Clausen Hans H. Wandall Eric P. Bennett 《Nucleic acids research》2015,43(9):e59

Nuclease-based gene editing tools are rapidly transforming capabilities for altering the genome of cells and organisms with great precision and in high throughput studies. A major limitation in the application of precise gene editing lies in the lack of sensitive and fast methods to detect and characterize the induced DNA changes. Precise gene editing induces double-stranded DNA breaks that are repaired by error-prone non-homologous end joining, leading to the introduction of insertions and deletions (indels) at the target site. These indels are often small, and detecting them by traditional methods is difficult and laborious. Here we present a method for fast, sensitive and simple indel detection that accurately defines indel sizes down to ±1 bp. The method, coined IDAA for Indel Detection by Amplicon Analysis, is based on tri-primer amplicon labelling and DNA capillary electrophoresis detection, and IDAA is amenable to high throughput analysis.
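Because the method sizes labelled amplicons by capillary electrophoresis, the indel size at a target site can be read as the difference between an observed fragment length and the unedited amplicon length. The sketch below illustrates only that arithmetic; the function name and inputs are hypothetical, and real fragment analysis also involves size standards and peak calling, which are not shown.

```python
def indel_sizes_from_peaks(wild_type_bp: float, peak_sizes_bp: list[float]) -> list[int]:
    """Convert fragment sizes to indel sizes relative to the unedited amplicon.

    Positive values indicate insertions, negative values deletions,
    and 0 an amplicon of wild-type length.
    """
    # Round to whole base pairs, since indels change length in 1 bp steps.
    return [round(peak - wild_type_bp) for peak in peak_sizes_bp]
```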

7.

Vindel: a simple pipeline for checking indel redundancy

Zhiyi Li Xiaowei Wu Bin He Liqing Zhang 《BMC bioinformatics》2014,15(1)

Background

With the advance of next generation sequencing (NGS) technologies, a large number of insertion and deletion (indel) variants have been identified in human populations. Despite much research into variant calling, it has been found that a non-negligible proportion of the identified indel variants might be false positives due to sequencing errors, artifacts caused by ambiguous alignments, and annotation errors.

Results

In this paper, we examine indel redundancy in dbSNP, one of the central databases for indel variants, and develop a standalone computational pipeline, dubbed Vindel, to detect redundant indels. The pipeline first applies indel position information to form candidate redundant groups, then performs indel mutations to the reference genome to generate corresponding indel variant substrings. Finally the indel variant substrings in the same candidate redundant groups are compared in a pairwise fashion to identify redundant indels. We applied our pipeline to check for redundancy in the human indels in dbSNP. Our pipeline identified approximately 8% redundancy in insertion type indels, 12% in deletion type indels, and overall 10% for insertions and deletions combined. These numbers are largely consistent across all human autosomes. We also investigated indel size distribution and adjacent indel distance distribution for a better understanding of the mechanisms generating indel variants.
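The three steps described above (group by position, apply each indel to the reference, compare the resulting sequences) can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the Vindel code: the tuple layout `(name, pos, ref, alt)` with 0-based positions, the `max_gap` grouping threshold, and both function names are hypothetical, and the sketch mutates the whole reference string and groups identical results with a dictionary rather than comparing substrings pairwise.

```python
from collections import defaultdict

def apply_indel(reference: str, pos: int, ref: str, alt: str) -> str:
    """Return the reference with the `ref` allele at 0-based `pos` replaced by `alt`."""
    if reference[pos:pos + len(ref)] != ref:
        raise ValueError("ref allele does not match the reference sequence")
    return reference[:pos] + alt + reference[pos + len(ref):]

def find_redundant_indels(reference, indels, max_gap=10):
    """Return lists of indel names that yield the same mutated sequence.

    `indels` is a list of (name, pos, ref, alt) tuples.
    """
    # Step 1: form candidate groups from indels that lie close together.
    groups, current = [], []
    for indel in sorted(indels, key=lambda v: v[1]):
        if current and indel[1] - current[-1][1] > max_gap:
            groups.append(current)
            current = []
        current.append(indel)
    if current:
        groups.append(current)

    # Steps 2 and 3: mutate the reference and collect identical outcomes.
    redundant = []
    for group in groups:
        by_sequence = defaultdict(list)
        for name, pos, ref, alt in group:
            by_sequence[apply_indel(reference, pos, ref, alt)].append(name)
        redundant.extend(sorted(names) for names in by_sequence.values() if len(names) > 1)
    return redundant
```

In a repeat such as `CACACA`, deleting the first `CA` or the second `CA` gives the same sequence, so two database records at different positions can describe one variant.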

Conclusions

Vindel, a simple yet effective computational pipeline, can be used to check whether a set of indels is redundant with respect to those already in a database of interest such as NCBI’s dbSNP. Of the approximately 5.9 million indels we examined, nearly 0.6 million are redundant, revealing a serious limitation in the current indel annotation. Statistical results confirm the consistency of the pipeline's indel redundancy detection across all 22 chromosomes. Apart from the standalone Vindel pipeline, the indel redundancy check algorithm is also implemented in the web server http://bioinformatics.cs.vt.edu/zhanglab/indelRedundant.php.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0359-1) contains supplementary material, which is available to authorized users.

8.

The missing indels: an estimate of indel variation in a human genome and analysis of factors that impede detection

Yue Jiang Andrei L. Turinsky Michael Brudno 《Nucleic acids research》2015,43(15):7217-7228

9.

The pattern of insertion/deletion polymorphism in Arabidopsis thaliana

Zhang W Sun X Yuan H Araki H Wang J Tian D 《Molecular genetics and genomics : MGG》2008,280(4):351-361

Little is known about variation of nucleotide insertion/deletions (indels) within species. In Arabidopsis thaliana, we investigated indel polymorphism patterns between two genome sequences and among 96 accessions at 1215 loci. Our study identified patterns in the variation of indel density, size, GC content and distribution, and a correlation between indels and substitutions. We found that the GC content in indel sequences was lower than that in non-indel sequences and that indels typically occur in regions with lower GC content. Patterns of indel frequency distribution among populations were more consistent with neutral expectation than substitution patterns. We also found that the local level of substitutions is positively correlated with indel density and negatively correlated with the distance to the closest indel, suggesting that indels play an important role in nucleotide variation.

10.

Sequence alignments and pair hidden Markov models using evolutionary history

Knudsen B Miyamoto MM 《Journal of molecular biology》2003,333(2):453-460

This work presents a novel pairwise statistical alignment method based on an explicit evolutionary model of insertions and deletions (indels). Indel events of any length are possible according to a geometric distribution. The geometric distribution parameter, the indel rate, and the evolutionary time are all maximum likelihood estimated from the sequences being aligned. Probability calculations are done using a pair hidden Markov model (HMM) with transition probabilities calculated from the indel parameters. Equations for the transition probabilities make the pair HMM closely approximate the specified indel model. The method provides an optimal alignment, its likelihood, the likelihood of all possible alignments, and the reliability of individual alignment regions. Human alpha and beta-hemoglobin sequences are aligned, as an illustration of the potential utility of this pair HMM approach.
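A geometric length distribution assigns probability (1 - q) · q^(k - 1) to an indel of length k, so every length k ≥ 1 is possible but longer indels become progressively rarer. The sketch below uses this common parameterization as an assumption; the abstract does not give the paper's exact formulation, and the parameter name `q` and both function names are hypothetical.

```python
def geometric_indel_length_pmf(k: int, q: float) -> float:
    """Probability that an indel has length k under a geometric model.

    `q` is the assumed extension probability, with 0 <= q < 1.
    """
    if not 0.0 <= q < 1.0:
        raise ValueError("q must satisfy 0 <= q < 1")
    if k < 1:
        return 0.0
    # Extend the indel k - 1 times, then stop.
    return (1.0 - q) * q ** (k - 1)

def mean_indel_length(q: float) -> float:
    """Expected indel length under the same geometric model."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must satisfy 0 <= q < 1")
    return 1.0 / (1.0 - q)
```

In a pair HMM, a geometric length distribution of this kind arises naturally from a self-transition on the gap states.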

11.

Indel seeds for homology search

Mak D Gelfand Y Benson G 《Bioinformatics (Oxford, England)》2006,22(14):e341-e349

We are interested in detecting homologous genomic DNA sequences with the goal of locating approximate inverted, interspersed, and tandem repeats. Standard search techniques start by detecting small matching parts, called seeds, between a query sequence and database sequences. Contiguous seed models have existed for many years. Recently, spaced seeds were shown to be more sensitive than contiguous seeds without increasing the random hit rate. To determine the superiority of one seed model over another, a model of homologous sequence alignment must be chosen. Previous studies evaluating spaced and contiguous seeds have assumed that matches and mismatches occur within these alignments, but not insertions and deletions (indels). This is perhaps appropriate when searching for protein coding sequences (<5% of the human genome), but is inappropriate when looking for repeats in the majority of genomic sequence where indels are common. In this paper, we assume a model of homologous sequence alignment which includes indels and we describe a new seed model, called indel seeds, which explicitly allows indels. We present a waiting time formula for computing the sensitivity of an indel seed and show that indel seeds significantly outperform contiguous and spaced seeds when homologies include indels. We discuss the practical aspect of using indel seeds and finally we present results from a search for inverted repeats in the dog genome using both indel and spaced seeds.
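The difference between contiguous and spaced seeds can be made concrete with a small hit finder. In the sketch below a seed is written as a string of `1` (position must match) and `0` (position is ignored), which is a common convention assumed here for illustration; the function name is hypothetical, and the paper's indel seed model itself is not reproduced because the abstract does not define it.

```python
def seed_hits(query: str, target: str, seed: str):
    """Return (i, j) pairs where query[i:] and target[j:] agree at every '1' in seed."""
    span = len(seed)
    care = [k for k, flag in enumerate(seed) if flag == "1"]

    # Index every target window by the characters at the positions that must match.
    index = {}
    for j in range(len(target) - span + 1):
        key = "".join(target[j + k] for k in care)
        index.setdefault(key, []).append(j)

    # Look up each query window in the index.
    hits = []
    for i in range(len(query) - span + 1):
        key = "".join(query[i + k] for k in care)
        hits.extend((i, j) for j in index.get(key, []))
    return hits
```

With `query = "ACGTA"` and `target = "ACCTA"`, the spaced seed `1101` tolerates the mismatch at the third position, while the contiguous seed `1111` finds no hit. Neither seed type tolerates a length change between the two windows, which is the gap that indel seeds are described as addressing.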

12.

Scoring insertion-deletion polymorphisms by dynamic allele-specific hybridization

Sawyer SL Howell WM Brookes AJ 《BioTechniques》2003,35(2):292-6, 298

Genome variation provides researchers with thousands of markers with which to study human demographic history and phenotypes. Insertion-deletion (indel) polymorphism is an important and abundant form of human genome variation, and convenient methods for genotyping indels are therefore needed. Here we evaluate dynamic allele-specific hybridization (DASH) for its ability to score indels. Evaluation of six model indel DASH assays based on synthetic oligonucleotides showed that length differences of 1-5 bp were accurately scored. Only single probes were required to assay indels of 3-4 bp or less, while longer indels tended to require the use of both allele probes serially. The best results were obtained by central placing of the probe over the indel. Model study findings were confirmed by running indel DASH assays upon PCR-amplified targets representing four polymorphisms from Alzheimer's disease candidate genes APBB1 and LRP1. These indels were genotyped in a set of 121 patients and 156 controls. While no disease association was found, the data quality confirmed that DASH is a robust and useful procedure for genotyping indels of the size range typically found in the human genome.

13.

Development of indel markers from Citrus clementina (Rutaceae) BAC-end sequences and interspecific transferability in Citrus

Ollitrault F Terol J Martin AA Pina JA Navarro L Talon M Ollitrault P 《American journal of botany》2012,99(7):e268-e273

• Premise of the study: Indel markers were developed from BAC-end sequences of Citrus clementina cv. Nules. Transferability and polymorphism were tested in the Citrus genus to estimate the potential of indel markers mined from a single genotype for use in genetic studies. • Methods and Results: Using polyacrylamide gel electrophoresis and DNA silver staining, 89 indel markers were tested for their transferability and polymorphism. Thirty-eight markers were selected. Heterozygosity in C. clementina cv. Nules was confirmed for 33 of these indel pairs. A preliminary diversity study using a capillary electrophoresis fragment analyzer was conducted with 21 indels using 45 accessions representing Citrus genus diversity. Intraspecific and interspecific polymorphisms were observed. • Conclusions: These results indicate the utility of indel markers developed from sequence data of a single genotype of interspecific origin. In Citrus, these markers will be useful for genetic mapping, germplasm characterization, and phylogenetic assignment of DNA fragments.

14.

Analysis of the indel at the ARMS2 3′UTR in age-related macular degeneration

Gaofeng Wang Kylee L. Spencer William K. Scott Patrice Whitehead Brenda L. Court Juan Ayala-Haedo Ping Mayo Stephen G. Schwartz Jaclyn L. Kovach Paul Gallins Monica Polk Anita Agarwal Eric A. Postel Jonathan L. Haines Margaret A. Pericak-Vance 《Human genetics》2010,127(5):595-602

15.

Identification of single nucleotide polymorphism in ginger using expressed sequence tags

Arumugam Chandrasekar Aikkal Riju Kandiyl Sithara Sahadevan Anoop Santhosh J Eapen 《Bioinformation》2009,4(3):119-122

16.

The genome-wide landscape of small insertion and deletion mutations in Monopterus albus

Feng Chen Majing Luo Yu-San Han Hanhua Cheng Rongjia Zhou 《Journal of Genetics and Genomics》2019,46(2):75-86

17.

Probabilistic whole-genome alignments reveal high indel rates in the human and mouse genomes

Lunter G 《Bioinformatics (Oxford, England)》2007,23(13):i289-i296

MOTIVATION: The two mutation processes that have the largest impact on genome evolution at small scales are substitutions, and sequence insertions and deletions (indels). While the former have been studied extensively, indels have received less attention, and in particular, the problem of inferring indel rates between pairs of divergent sequences remains unsolved. Here, I describe a novel and accurate method for estimating neutral indel rates between divergent pairs of genomes. RESULTS: Simulations suggest that the new method for estimating indel rates is accurate to within 2%, at divergences corresponding to that of human and mouse. Applying the method to these species, I show that indel rates are up to twice as high as is apparent from alignments, and depend strongly on the local G + C content. These results indicate that at these evolutionary distances, the contribution of indels to sequence divergence is much larger than hitherto appreciated. In particular, the ratio of substitution to indel rates between human and mouse appears to be around gamma = 8, rather than the currently accepted value of about gamma = 14.

18.

Incorporating indels as phylogenetic characters: Impact for interfamilial relationships within Arctoidea (Mammalia: Carnivora)

Peng-tao Luan Oliver A. Ryder Heidi Davis Ya-ping Zhang Li Yu 《Molecular phylogenetics and evolution》2013,66(3):748-756

Insertion and deletion events (indels) provide a suite of markers with enormous potential for molecular phylogenetics. Using many more indel characters than those in previous studies, we here for the first time address the impact of indel inclusion on the phylogenetic inferences of Arctoidea (Mammalia: Carnivora). Based on 6843 indel characters from 22 nuclear intron loci of 16 species of Arctoidea, our analyses demonstrate that when the indels were not taken into consideration, the monophyly of Ursidae and Pinnipedia tree and the monophyly of Pinnipedia and Musteloidea tree were both recovered, whereas inclusion of indels by using three different indel coding schemes gives identical phylogenetic tree topologies supporting the monophyly of Ursidae and Pinnipedia. Our work brings new perspectives on the previously controversial placements among Arctoidea families, and provides another example demonstrating the importance of identifying and incorporating indels in the phylogenetic analyses of introns. In addition, comparison of indel incorporation methods revealed that the three indel coding methods are all advantageous over treating indels as missing data, given that incorporating indels produces consistent results across methods. This is the first report of the impact of different indel coding schemes on phylogenetic reconstruction at the family level in Carnivora, which indicates that indels should be taken into account in future phylogenetic analyses.

19.

Tool evaluation for the detection of variably sized indels from next generation whole genome and targeted sequencing data

Ning Wang Vladislav Lysenkov Katri Orte Veli Kairisto Juhani Aakko Sofia Khan Laura L. Elo 《PLoS computational biology》2022,18(2)

Insertions and deletions (indels) in human genomes are associated with a wide range of phenotypes, including various clinical disorders. High-throughput, next generation sequencing (NGS) technologies enable the detection of short genetic variants, such as single nucleotide variants (SNVs) and indels. However, the variant calling accuracy for indels remains considerably lower than for SNVs. Here we present a comparative study of the performance of variant calling tools for indel calling, evaluated with a wide repertoire of NGS datasets. While there is no single optimal tool to suit all circumstances, our results demonstrate that the choice of variant calling tool greatly impacts the precision and recall of indel calling. Furthermore, to reliably detect indels, it is essential to choose NGS technologies that offer a long read length and high coverage coupled with specific variant calling tools.
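Precision and recall for a set of indel calls are computed against a truth set: precision is the fraction of calls that are true, and recall is the fraction of true indels that were called. The sketch below assumes exact matching on hypothetical `(chrom, pos, ref, alt)` tuples; it is not the evaluation procedure of the study, and practical benchmarking typically normalizes indel representation before comparison, which is not shown.

```python
def precision_recall(called: set, truth: set) -> tuple[float, float]:
    """Return (precision, recall) for a call set compared with a truth set."""
    true_positives = len(called & truth)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall
```

A caller that reports three indels, two of which appear in a truth set of four, has precision 2/3 and recall 1/2.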

20.

Indel information eliminates trivial sequence alignment in maximum likelihood phylogenetic analysis

John S.S. Denton Ward C. Wheeler 《Cladistics : the international journal of the Willi Hennig Society》2012,28(5):514-528

Although there has been a recent proliferation in maximum‐likelihood (ML)‐based tree estimation methods based on a fixed sequence alignment (MSA), little research has been done on incorporating indel information in this traditional framework. We show, using a simple model on a single character example, that a trivial alignment of a different form than that previously identified for parsimony is optimal in ML under standard assumptions treating indels as “missing” data, but that it is not optimal when indels are incorporated into the character alphabet. We show that the optimality of the trivial alignment is not an artefact of simplified theory assumptions by demonstrating that trivial alignment likelihoods of five different multiple sequence alignment datasets exhibit this phenomenon. These results demonstrate the need for use of indel information in likelihood analysis on fixed MSAs, and suggest that caution must be exercised when drawing conclusions from software implementations claiming improvements in likelihood scores under an indels‐as‐missing assumption. © The Willi Hennig Society 2012.