Similar articles
 20 similar articles found; search time 0 ms
1.
A fundamental task in sequence analysis is to calculate the probability of a multiple alignment given a phylogenetic tree relating the sequences and an evolutionary model describing how sequences change over time. However, the most widely used phylogenetic models only account for residue substitution events. We describe a probabilistic model of a multiple sequence alignment that accounts for insertion and deletion events in addition to substitutions, given a phylogenetic tree, using a rate matrix augmented by the gap character. Starting from a continuous Markov process, we construct a non-reversible generative (birth-death) evolutionary model for insertions and deletions. The model assumes that insertion and deletion events occur one residue at a time. We apply this model to phylogenetic tree inference by extending the program dnaml in phylip. Using standard benchmarking methods on simulated data and a new "concordance test" benchmark on real ribosomal RNA alignments, we show that the extended program dnamlepsilon improves accuracy relative to the usual approach of ignoring gaps, while retaining the computational efficiency of the Felsenstein peeling algorithm.
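
The peeling computation mentioned in this abstract can be illustrated with a minimal Python sketch. It treats the gap as a fifth character state and computes the likelihood of one alignment column on a small tree. Note the assumptions: the symmetric five-state model (`transition_prob`, rate `mu`), the uniform root distribution, and the nested-list tree encoding are hypothetical simplifications chosen so that the example needs no matrix exponential. The paper's own model is a non-reversible birth-death process and is not reproduced here.

```python
import math

# Four nucleotides plus the gap character as a fifth state.
STATES = "ACGT-"


def transition_prob(src, dst, t, mu=1.0):
    """P(dst | src, t) under a symmetric 5-state model.

    Assumption for illustration only: every state changes to every other
    state at the same rate mu, which gives a closed-form solution.
    """
    k = len(STATES)
    decay = math.exp(-k * mu * t)
    if src == dst:
        return 1.0 / k + (k - 1) / k * decay
    return 1.0 / k - decay / k


def partial_likelihoods(node):
    """Felsenstein peeling: P(leaves below node | state at node), per state.

    A leaf is a one-character string; an internal node is a list of
    (child, branch_length) pairs (hypothetical encoding).
    """
    if isinstance(node, str):
        return [1.0 if s == node else 0.0 for s in STATES]
    result = [1.0] * len(STATES)
    for child, branch_length in node:
        child_l = partial_likelihoods(child)
        for a, parent_state in enumerate(STATES):
            # Sum over the child's possible states along this branch.
            result[a] *= sum(
                transition_prob(parent_state, child_state, branch_length) * child_l[b]
                for b, child_state in enumerate(STATES)
            )
    return result


def column_likelihood(tree):
    """Likelihood of one alignment column, with a uniform root distribution."""
    return sum(x / len(STATES) for x in partial_likelihoods(tree))


# One column with characters A, -, A on a three-leaf tree.
example_tree = [([("A", 0.1), ("-", 0.1)], 0.2), ("A", 0.3)]
print(column_likelihood(example_tree))
```

Each node is visited once, so the cost grows linearly with the number of leaves; this is the efficiency property the abstract refers to. Adding the gap as an ordinary state only enlarges the state space from four to five.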

2.
We have sequenced the heavy and light chain genes from 365 IgG(+) B cells and found that 24 (6.5%) contain somatically introduced insertions or deletions. These insertions and deletions are clustered at "hot-spots" in the antigen-binding site and frequently result in the creation of new combinations of canonical loop structures or entirely new loops that are not present in the human germline repertoire, but are similar to those seen in other species. Somatic insertion and deletion therefore provide a further mechanism for introducing structural diversity into antibodies in addition to somatic point mutation and receptor editing, which have small (single amino acid changes) and large (chain replacement) impacts on structural diversity, respectively.

3.
Many approaches to computing the genomic distance are still limited to genomes with the same content, without duplicated markers. However, differences in the gene content are frequently observed and can reflect important evolutionary aspects. While duplicated markers can hardly be handled by exact models, when duplicated markers are not allowed, a few polynomial time algorithms that include genome rearrangements, insertions and deletions have already been proposed. In an attempt to improve these results, in the present work we give the first linear time algorithm to compute the distance between two multichromosomal genomes with unequal content, but without duplicated markers, considering insertions, deletions and double cut and join (DCJ) operations. We derive from this approach algorithms to sort one genome into another one also using DCJ operations, insertions and deletions. The optimal sorting scenarios can have different compositions and we compare two types of sorting scenarios: one that maximizes and one that minimizes the number of DCJ operations with respect to the number of insertions and deletions. We also show that, although the triangle inequality can be disrupted in the proposed genomic distance, it is possible to correct this problem by adopting a surcharge on the number of non-common markers. We use our method to analyze six species of Rickettsia, a group of obligate intracellular parasites, and identify preliminary evidence of clusters of deletions.

4.
5.
Analysis of insertions/deletions in protein structures.   Total citations: 17 (self-citations: 0, citations by others: 17)
An analysis of insertions and deletions (indels) occurring in a databank of multiple sequence alignments based on protein tertiary structure is reported. Indels prefer to be short (1 to 5 residues). The average intervening sequence length between them versus the percentage of residue identity in pairwise alignments shows an exponential behaviour, suggesting a stochastic process such that nearly every loop in an ancestral structure is a possible target for indels during evolution. The results also suggest a limit to the average size of indels accommodated by protein structures. The preferred indel conformations are reverse turn and coil as are the preferred conformations at the indel edges (N- and C-terminal sides). Interruptions in helices and strands were observed as very rare events.

6.
7.
Insertions and deletions of nucleotides in the genes encoding the variable domains of antibodies are natural components of the hypermutation process, which may expand the available repertoire of hypervariable loop lengths and conformations. Although insertion of amino acids has also been utilized in antibody engineering, little is known about the functional consequences of such modifications. To investigate this further, we have introduced single-codon insertions and deletions as well as more complex modifications in the complementarity-determining regions of human antibody fragments with different specificities. Our results demonstrate that single amino acid insertions and deletions are generally well tolerated and permit production of stably folded proteins, often with retained antigen recognition, despite the fact that the thus modified loops carry amino acids that are disallowed at key residue positions in canonical loops of the corresponding length or are of a length not associated with a known canonical structure. We have thus shown that single-codon insertions and deletions can efficiently be utilized to expand structure and sequence space of the antigen-binding site beyond what is encoded by the germline gene repertoire.

8.
A new technique to evaluate methods for the synthesis of peptides was developed. It is based on the identification and quantitation of peptide by-products by mass spectrometry. Model oligopeptides containing 10 or 20 alanine residues were synthesized by automated solid phase methods using a variety of protocols, and the levels of deletion and insertion peptides were measured by the 252Cf fission fragment ionization time-of-flight spectrometric technique in which the total, unfractionated, synthetic product was deposited on a film of nitrocellulose and analyzed. The introduction of D-alanine at every third residue of the model eliminated peptide conformation problems that led to incomplete reactions in the all-L model. Couplings with preformed symmetrical anhydrides in dimethylformamide gave rise to significant levels of both deletion peptides and insertion peptides. The best of the protocols examined was a double coupling of tert-butyloxycarbonyl-alanine by in situ activation with dicyclohexylcarbodiimide in dichloromethane. [D-Ala3,6,9,12,15,18]Ala20-Val was synthesized with an average deletion of only 0.036% per step and an average insertion of only 0.029% per step, which is equivalent to a stepwise yield of 99.93% for the target peptide.
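
As a rough arithmetic check of the figures quoted above, the per-step yield is what remains after subtracting the average deletion and insertion rates. The sketch below assumes the two error rates simply add. The 20-step cumulative figure is an illustrative assumption about the number of couplings, not a value reported in the abstract, and both function names are hypothetical.

```python
def stepwise_yield(deletion_pct, insertion_pct):
    """Percent of chains that gain exactly one correct residue in a step.

    Assumes deletion and insertion are the only error modes and are additive.
    """
    return 100.0 - deletion_pct - insertion_pct


def cumulative_yield(step_yield_pct, n_steps):
    """Percent of chains that are correct after n_steps independent steps."""
    return 100.0 * (step_yield_pct / 100.0) ** n_steps


step = stepwise_yield(0.036, 0.029)
print(f"stepwise yield: {step:.3f}%")
# n_steps=20 is an assumption made for illustration only.
print(f"cumulative yield over 20 steps: {cumulative_yield(step, 20):.2f}%")
```

The per-step value comes out at about 99.935%, consistent with the 99.93% stated in the abstract.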

9.

Background  

Probabilistic models for sequence comparison (such as hidden Markov models and pair hidden Markov models for proteins and mRNAs, or their context-free grammar counterparts for structural RNAs) often assume a fixed degree of divergence. Ideally we would like these models to be conditional on evolutionary divergence time.

10.

Background  

We describe the distribution of indels in the 44 Encyclopedia of DNA Elements (ENCODE) regions (about 1% of the human genome) and evaluate the potential contributions of small insertion and deletion polymorphisms (indels) to human genetic variation. We relate indels to known genomic annotation features and measures of evolutionary constraint.

11.
We studied the process by which whd, a P-element insertion allele of the Drosophila melanogaster white locus, is replaced by its homolog in the presence of transposase. These events are interpreted as the result of double-strand gap repair following excision of the P transposon in whd. We used a series of alleles derived from whd through P-element mobility as templates for this repair. One group of alleles, referred to collectively as whd-F, carried fragments of the P element that had lost some of the sequences needed in cis for mobility. The other group, whd-D, had lost all of the P insert and had some of the flanking DNA from white deleted. The average replacement frequencies were 43% for whd-F alleles and 7% for the whd-D alleles. Some of the former were converted at frequencies exceeding 50%. Our data suggest that the high conversion frequencies for the whd-F templates can be attributed at least in part to an elevated efficiency of repair of unexpanded gaps that is possibly caused by the closer match between whd-F sequences and the unexpanded gap endpoints. In addition, we found that the gene substitutions were almost exclusively in the direction of whd being replaced by the whd-F or whd-D allele rather than the reverse. The template alleles were usually unaltered in the process. This asymmetry implies that the conversion process is unidirectional and that the P fragments are not good substrates for P-element transposase. Our results help elucidate a highly efficient double-strand gap repair mechanism in D. melanogaster that can also be used for gene replacement procedures involving insertions and deletions. They also help explain the rapid spread of P elements in populations.

12.
We examined gross-anatomically the cruropedal muscles, which control the toe movements, in some species of insectivores, rodents and primates including humans, with a focus on the phylogenetic developments of these muscles including the distribution patterns of the tendons to the toes. Morphological changes corresponding to the phylogenetic advancement from primitive terrestrial mammals to arboreal primates were found in the short extensors and flexors, presumably in association with the enhancement of independent digital mobility. In contrast, the changes which correspond to the acquisition of terrestrial bipedality in humans were identified in the development of extensors and flexors which govern the first toe, as well as in the establishment of the peroneus tertius that dorsi-flexes the talocrural joint.

13.
Insertions and deletions (indels) cause numerous genetic diseases and lead to pronounced evolutionary differences among genomes. The macaque sequences provide an opportunity to gain insights into the mechanisms generating these mutations on a genome-wide scale by establishing the polarity of indels occurring in the human lineage since its divergence from the chimpanzee. Here we apply novel regression techniques and multiscale analyses to demonstrate an extensive regional indel rate variation stemming from local fluctuations in divergence, GC content, male and female recombination rates, proximity to telomeres, and other genomic factors. We find that both replication and, surprisingly, recombination are significantly associated with the occurrence of small indels. Intriguingly, the relative inputs of replication versus recombination differ between insertions and deletions, thus the two types of mutations are likely guided in part by distinct mechanisms. Namely, insertions are more strongly associated with factors linked to recombination, while deletions are mostly associated with replication-related features. Indel as a term misleadingly groups the two types of mutations together by their effect on a sequence alignment. However, here we establish that the correct identification of a small gap as an insertion or a deletion (by use of an outgroup) is crucial to determining its mechanism of origin. In addition to providing novel insights into insertion and deletion mutagenesis, these results will assist in gap penalty modeling and eventually lead to more reliable genomic alignments.

14.
15.
A single-stage polymerase-based procedure is described that allows extensive modifications of DNA. The version described here uses the QuikChange Site-Directed Mutagenesis System kit supplied by Stratagene. The original protocol is replaced by a single-stage method in which linear production of complementary strands is accomplished in separate single primer reactions. This has proved effective in introducing insertions and deletions into large gene/vector combinations without subcloning.

16.
A simple and general method for disrupting chromosomal genes and introducing insertions is described. This procedure involves eliminating wild-type bacterial genes and introducing mutant alleles or other insertions at the original locus of the wild-type gene. To demonstrate the utility of this approach, the tig gene of Escherichia coli was replaced by homologous recombination with a cassette containing the chloramphenicol resistance gene and the sacB gene. The cassette was then removed and the tig mutant alleles were moved into the native tig location. Sequencing and Western blotting results demonstrated that insertions or deletions can be introduced precisely in E. coli using our approach. Our system does not require extra in vitro manipulations such as restriction digestion or ligation, and does not require use of specific plasmids or strains which are used to prevent false positive transformants caused by template plasmid transformation. This technique can be used widely in bacterial genome analysis.

17.
Tan EC, Li H. Gene, 2006, 376(2): 268-280
Most of the studies on single nucleotide variations are on substitutions rather than insertions/deletions. In this study, we examined the distribution and characteristics of single nucleotide insertions/deletions (SNindels), using data available from dbSNP for all the human chromosomes. There are almost 300,000 SNindels in the database, of which only 0.8% are validated. They occur at the frequency of 0.887 per 10 kb on average for the whole genome, or approximately 1 for every 11,274 bp. More than half occur in regions with mononucleotide repeats, the longest of which is 47 bases. Overall the mononucleotide repeats involving C and G are much shorter than those for A and T. About 12% are surrounded by palindromes. There is general correlation between chromosome size and total number for each chromosome. Inter-chromosomal variation in density ranges from 0.6 to 21.7 per kilobase. The overall spectrum shows a very high proportion of SNindels of types -/A and -/T at over 81%. The proportion of -/A and -/T SNindels for each chromosome is correlated to its AT content. Less than half of the SNindels are within or near known genes and even fewer (<0.183%) in coding regions, and more than 1.4% of -/C and -/G are in coding compared to 0.2% for -/A and -/T types. SNindels of -/A and -/T types make up 80% of those found within untranslated regions but less than 40% of those within coding regions. A separate analysis using the subset of 2324 validated SNindels showed slightly less AT bias of 74%; SNindels not within mononucleotide repeats showed even less AT bias at 58%. Density of validated SNindels is 0.007/10 kb overall and 90% are found within or near genes. Among all chromosomes, Y has the lowest numbers and densities for all SNindels, validated SNindels, and SNindels not within repeats.
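
The density figures in this abstract can be cross-checked with simple arithmetic. The sketch below converts a density per 10 kb into an average spacing in base pairs and scales the overall density by the validated fraction. The function names are hypothetical, and the inputs are the rounded values quoted in the abstract, so small discrepancies are expected.

```python
def bp_per_event(density_per_10kb):
    """Average spacing in bp between events, given events per 10 kb."""
    return 10_000 / density_per_10kb


def scaled_density(density_per_10kb, fraction):
    """Density of the subset that makes up `fraction` of all events."""
    return density_per_10kb * fraction


spacing = bp_per_event(0.887)             # overall SNindel density
validated = scaled_density(0.887, 0.008)  # 0.8% of SNindels are validated

print(f"average spacing: {spacing:.0f} bp")
print(f"validated density: {validated:.4f} per 10 kb")
```

With the quoted inputs, the spacing comes out near 11,274 bp and the validated density near 0.007 per 10 kb, in line with the values reported above.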

18.
We report the presence of four nuclear paralogs of a 380-bp segment of cytochrome b in callitrichine primates (marmosets and tamarins). The mitochondrial cytochrome b sequence and each nuclear paralog were obtained from several species, allowing multiple comparisons of rates and patterns of substitution both between mitochondrial and nuclear sequences and among nuclear sequences. The mitochondrial DNA had high overall rates of molecular evolution and a strong bias toward substitutions at third codon positions. Rates of molecular evolution among the nuclear sequences were low and constant, and there were small differences in substitution patterns among the nuclear clades which were probably attributable to the small number of sites involved. A novel method of phylogenetic reconstruction based on the large difference in rates of evolution at different codon positions among mitochondrial and nuclear clades was used to determine whether different nuclear paralogs represent independent transposition events or duplications following a single insertion. This method is generally applicable in cases where differences in pattern of molecular evolution are known, and it showed that at least three of the four nuclear clades represent independent transposition events. The insertion events giving rise to two of the nuclear clades predate the divergence of the callitrichines, whereas those leading to the other two nuclear clades may have occurred in the common ancestor of marmosets.

19.
The human haptoglobin two-gene cluster (HP-HPR) contains two retrovirus-like elements. One (RTVL-Ia) is in the first intron of the HPR gene, and the second (RTVL-Ic) is at the 3'-end of the gene cluster. The chimpanzee three-gene cluster (HP-HPR-HPP) contains an additional, third copy (RTVL-Ib) in the intergenic region between HPR and HPP. RTVL-Ia and RTVL-Ib are essentially full size and have the general structure, 5'-LTR-gag-pol-env-3'-LTR, while RTVL-Ic lacks about one-third of its 5'-part. Although none of the elements has retained long open reading frames, we could detect stretches having amino acids identical to various parts of Moloney murine leukemia virus (Mo-MuLV) proteins. We conclude that the RTVL-I elements were derived from a virus very similar in structure to Mo-MuLV. The DNA sequences surrounding the insertion points of the three RTVL-I elements are not alike and allow the inference that they integrated into the haptoglobin gene cluster independently at some time after the initial formation of the triplicated gene cluster in primates. Comparison of the nucleotide sequences of the three elements leads to the hypothesis that foreign DNA introduced into the genome can initially accumulate mutations more rapidly than the genomic sequences surrounding it.

20.
Homology directed repair (HDR) defends cells against the toxic effects of two-ended double strand breaks (DSBs) and one-ended DSBs that arise when replication progression is inhibited, for example by encounter with DNA lesions such as interstrand crosslinks (ICLs). HDR can occur via various mechanisms, some of which are associated with an increased risk of concurrent sequence rearrangements that can lead to deletions, insertions, translocations and loss of heterozygosity. Here, we compared the risk of HDR-associated sequence rearrangements that occur spontaneously versus in response to exposure to an agent that induces ICLs. We describe the creation of two fluorescence-based direct repeat recombination substrates that have been targeted to the ROSA26 locus of embryonic stem cells, and that detect the major pathways of homologous recombination events, e.g., gene conversions with or without crossing over, repair of broken replication forks, and single strand annealing (SSA). SSA can be distinguished from other pathways by application of a matched pair of site-specifically integrated substrates, one of which allows detection of SSA, and one that does not. We show that SSA is responsible for a significant proportion of spontaneous homologous recombination events at these substrates, suggesting that two-ended DSBs are a common spontaneous recombinogenic lesion. Interestingly, exposure to mitomycin C (an agent that induces ICLs) increases the proportion of HDR events associated with deletions and insertions. Given that many chemotherapeutics induce ICLs, these results have important implications in terms of the risk of chemotherapy-induced deleterious sequence rearrangements that could potentially contribute to secondary tumors.
