Similar articles
20 similar articles found
1.
Proteins are often monitored by combining a fluorescent polypeptide tag with the target protein. However, due to the high molecular weight and immunogenicity of such tags, they are not suitable choices for combining with fusion proteins such as immunotoxins. In this study, we designed a polypeptide sequence with a dual role (it acts as both a linker and a fluorescent probe) to use with fusion proteins. Two common fluorescent tag sequences based on tetracysteine were compared to a commonly used rigid linker as well as our proposed dual-purpose sequence. Computational investigations showed that the dual-purpose sequence was structurally stable and may be a good choice to use as both a linker and a fluorescence marker between two moieties in a fusion protein.

2.
The bacterial protein streptokinase (SK) contains three independently folded domains (α, β and γ), interconnected by two flexible linkers with noticeable sequence homology. To investigate their primary structure requirements, the linkers were swapped, i.e. linker 1 (between the α and β domains) was exchanged with linker 2 (between the β and γ domains) and vice versa. The resultant construct exhibited very low activity, essentially due to enhanced proteolytic susceptibility. However, an SK mutant with two linker 1 sequences, which was proteolytically as stable as WT-rSK, retained about 10% of the plasminogen activator activity of rSK. When the native sequence of each linker was substituted with nine consecutive glycine residues, the linker 1 substitution mutant retained substantial activity, whereas the linker 2 mutant lost nearly all its activity. The optimal length of the linkers was then studied through deletion mutagenesis experiments, which showed that deletion of more than three residues in either linker resulted in virtually complete loss of activator activity. The effect of linker length was also examined by inserting extraneous pentapeptide sequences with a propensity for adopting either an extended conformation or a relatively rigid conformation. The insertion of poly-Pro sequences into the native linker 2 sequence caused up to a 10-fold reduction in activity, whereas its effect in linker 1 was relatively minor. Interestingly, most of the linker mutants could form stable 1:1 complexes with human plasminogen.
Taken together, these observations suggest that (i) the functioning of the inter-domain linkers of SK requires a critical minimal length, (ii) linker 1 is relatively more tolerant of insertions and sequence alterations, and appears to function primarily as a covalent connector between the α and β domains, and (iii) the native linker 2 sequence is virtually indispensable for the activity of SK, probably because of structural and/or flexibility requirements in SK action during catalysis.

3.
Herold KE, Rasooly A. BioTechniques, 2003, 35(6): 1216-1221
Oligonucleotide microarrays have demonstrated potential for the analysis of gene expression, genotyping, and mutational analysis. Our work focuses primarily on the detection and identification of bacteria based on known short sequences of DNA. Oligo Design, the software described here, automates several design aspects that enable the improved selection of oligonucleotides for use with microarrays for these applications. Two major features of the program are: (i) a tiling algorithm for the design of short overlapping temperature-matched oligonucleotides of variable length, which are useful for the analysis of single nucleotide polymorphisms and (ii) a set of tools for the analysis of multiple alignments of gene families and related short DNA sequences, which allow for the identification of conserved DNA sequences for PCR primer selection and variable DNA sequences for the selection of unique probes for identification. Note that the program does not address the full genome perspective but, instead, is focused on the genetic analysis of short segments of DNA. The program is Internet-enabled and includes a built-in browser and the automated ability to download sequences from GenBank by specifying the GI number. The program also includes several utilities, including audio recital of a DNA sequence (useful for verifying sequences against a written document), a random sequence generator that provides insight into the relationship between melting temperature and GC content, and a PCR calculator.
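The idea of tiling a sequence with overlapping, temperature-matched oligonucleotides of variable length can be sketched as follows. This is only an illustration, not the Oligo Design algorithm: the melting temperature is estimated with the simple Wallace rule (2 °C per A/T, 4 °C per G/C), and the function names and the `target_tm`, `step`, `min_len` and `max_len` parameters are assumptions introduced here.

```python
def wallace_tm(oligo: str) -> int:
    """Wallace-rule estimate of melting temperature (rough, short oligos only)."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def tile_oligos(seq, target_tm=60, step=5, min_len=15, max_len=30):
    """Return (start, oligo, tm) tiles; each oligo is grown until it reaches target_tm.

    Hypothetical helper: start positions advance by `step`, so neighbouring
    tiles overlap whenever `step` is smaller than the oligo length.
    """
    tiles = []
    for start in range(0, len(seq) - min_len + 1, step):
        for length in range(min_len, max_len + 1):
            oligo = seq[start:start + length]
            if len(oligo) < length:
                break  # ran past the end of the sequence
            tm = wallace_tm(oligo)
            if tm >= target_tm:
                tiles.append((start, oligo, tm))
                break  # shortest oligo that reaches the target
    return tiles


if __name__ == "__main__":
    for start, oligo, tm in tile_oligos("GC" * 20):
        print(start, oligo, tm)
```

GC-rich regions reach the target with shorter oligos than AT-rich regions, which is why the tile length varies along the sequence.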

4.
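With the rapid growth of influenza virus genome sequencing data, mining the biological information contained in these large genomic datasets has become a research focus. Building an automated, integrated, information-based sequence database system on the basis of epidemiological data for influenza viruses in China is of considerable practical value for the rapid batch translation, annotation, storage, querying, and analysis of influenza virus genomes. By integrating a series of software tools and packages with functions developed in-house, and by adding refined screening rules for selecting the best-matching translation and annotation on top of two key reference datasets maintained at the base level, our group built an automated system that provides influenza virus genome storage, automated translation, accurate protein sequence annotation, homologous sequence alignment, and phylogenetic tree analysis. The results show that when an influenza virus gene sequence in FASTA format is submitted through the web interface, the system runs a BLAST homology search against the reference segment dataset (blastdb.fasta) and identifies the virus type (A, B, or C), subtype, and gene segment (segments 1 to 8). On this basis, it queries the underlying reference dataset of gene segments used for translation and annotation to obtain a set of peptide sequences, and then calls ProSplign repeatedly in a loop to generate predictions for them. Using the refined screening rules, the translation product that best matches the input sequence is selected as its predicted protein and written out in common formats such as gbk, asn, and fasta, together with the sequence length, whether it is full length, and the virus type, subtype, and segment. Building on this work, we developed additional functions, such as phylogenetic tree analysis and display and genome data storage, to form a web-based system for the automated translation and annotation of influenza virus genomes. This study indicates that the system, which tightly integrates a series of software tools with its own annotation and translation database files, covers the workflow from sequence storage, translation, and annotation to sequence analysis and display, and can meet China's needs for shared, localized, integrated, and automated handling of high-throughput genetic testing data.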

5.
Green fluorescent protein (GFP) is often misfolded into nonfluorescent states when an aggregatable sequence is attached to its N-terminus. However, GFP fusions with highly aggregatable, prion-determining, and highly charged sequences from yeast prions, such as Sup35 and Ure2p, form green fibrils with properly folded GFP. To gain further insight into the general effect of an aggregatable sequence attached to a fluorescent protein, we designed eight fusion proteins of a yellow variant of GFP (YFP) containing an aggregation-prone amyloidogenic sequence derived from human medin, attached via linker sequences of different lengths. Seven fusion proteins formed white fibrils lacking native YFP function. However, the fusion with an 18-residue medin sequence and a 50-amino-acid linker formed fibrils with the yellow color of folded YFP. Deconvolution analysis of infrared spectra also supports the presence of properly folded YFP in the fibrils formed by this protein. These results suggest that attaching an amyloidogenic sequence to a folded protein can promote the formation of fibrils and disrupt the native structure, whereas the structure of the folded region can be retained by optimizing the sequences of the amyloidogenic and linker regions.

6.
Fusion of enveloped viruses with their target membrane is mediated by viral integral glycoproteins. A conformational change of their ectodomain triggers membrane fusion. Several studies suggest that an extended, triple-stranded, rod-shaped α-helical coiled coil represents a common structural and functional motif of the ectodomain of fusion proteins. From that, it is believed that essential features of the fusion process are conserved among the various enveloped viruses. However, this has not been established so far for the highly conserved transmembrane and intraviral sequences of fusion proteins. The article focuses on the role of both sequences in the fusion process. Recent studies of various enveloped viruses strongly imply that a transmembrane (TM) domain with a minimum length is required for later steps of membrane fusion, i.e., the formation and enlargement of the aqueous fusion pore. Although no specific sequence of the TM domain is necessary for pore formation, distinct properties and motifs of the domain may be required to ensure full fusion activity. However, with some exceptions, the intraviral domain seems not to be required for the fusion activity of viral fusion proteins.

7.
MOTIVATION: RNA secondary structure analysis often requires searching for potential helices in large sequence data. RESULTS: We present a utility program GUUGle that efficiently locates potential helical regions under RNA base pairing rules, which include Watson-Crick as well as G-U pairs. It accepts a positive and a negative set of sequences, and determines all exact matches under RNA rules between positive and negative sequences that exceed a specified length. The GUUGle algorithm can also be adapted to use a precomputed suffix array of the positive sequence set. We show how this program can be effectively used as a filter preceding a more computationally expensive task such as miRNA target prediction. AVAILABILITY: GUUGle is available via the Bielefeld Bioinformatics Server at http://bibiserv.techfak.uni-bielefeld.de/guugle
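To make "exact matches under RNA rules" concrete, the sketch below reports maximal antiparallel stretches in which every position forms a Watson-Crick or G-U pair. It is a brute-force dynamic-programming illustration, not the suffix-based GUUGle algorithm; the function name `find_helices`, its arguments, and the returned tuple layout are assumptions made for this example.

```python
# Allowed RNA pairs: Watson-Crick plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_helices(target, query, min_len=4):
    """Return (target_start, query_start, length) for maximal helical matches.

    target[t + k] pairs with query[q + length - 1 - k] (antiparallel strands).
    Runs in O(len(target) * len(query)) time; fine for short inputs only.
    """
    rev = query[::-1]  # reversing the query turns antiparallel runs into diagonals
    n, m = len(target), len(rev)
    hits = []
    prev = [0] * (m + 1)
    for i in range(1, n + 1):
        cur = [0] * (m + 1)
        for j in range(1, m + 1):
            if (target[i - 1], rev[j - 1]) in PAIRS:
                cur[j] = prev[j - 1] + 1
                # A run is maximal when the next diagonal cell cannot pair.
                extends = i < n and j < m and (target[i], rev[j]) in PAIRS
                if cur[j] >= min_len and not extends:
                    length = cur[j]
                    hits.append((i - length, m - j, length))
        prev = cur
    return hits


if __name__ == "__main__":
    print(find_helices("AAGGCUAA", "GGCU", min_len=4))
```

Note that `GGCU` can pair with a copy of itself only because G-U pairs are allowed, which is exactly the case that plain reverse-complement matching would miss.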

8.
The analysis and comparison of large numbers of immunoglobulin (Ig) sequences that arise during an antibody selection campaign can be time-consuming and tedious. Typically, the identification and annotation of framework as well as complementarity-determining regions (CDRs) is based on multiple sequence alignments using standardized numbering schemes, which allow identification of equivalent residues among different family members but often necessitate expert knowledge and manual intervention. Moreover, due to the enormous length variability of some CDRs, the benefit of conventional Ig numbering schemes is limited and the calculation of correct sequence alignments can become challenging. Whereas, in principle, a well-established set of rules permits the assignment of CDRs from the amino acid sequence alone, no currently available sequence alignment editor provides an algorithm to annotate new Ig sequences accordingly. Here we present a unique pattern matching method implemented in our recently developed ANTIC ALIgN editor that automatically identifies all hypervariable and framework regions in experimentally elucidated antibody sequences using so-called "regular expressions." By combining this widely supported software syntax with the unique capabilities of real-time aligning, editing and analyzing extended sets of amino acid and/or nucleotide sequences simultaneously on a local workstation, ANTIC ALIgN provides a powerful utility for antibody engineering. Proteins 2016; 85:65–71. © 2016 Wiley Periodicals, Inc.
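As an illustration of rule-based CDR assignment with regular expressions, the sketch below locates a heavy-chain CDR3 using a commonly quoted rule of thumb: the loop follows a Cys-X-X motif (often Cys-Ala-Arg) and is followed by Trp-Gly-X-Gly. The pattern, the length limits, and the example fragment are assumptions for this sketch; they are not the expressions used by ANTIC ALIgN, and real sequences can deviate from the rule.

```python
import re

# Hypothetical pattern: C + two residues, then a lazily matched loop of
# 3 to 25 residues, then the W-G-x-G motif that opens framework 4.
CDR_H3 = re.compile(r"C[A-Z]{2}(?P<cdr_h3>[A-Z]{3,25}?)WG[A-Z]G")


def find_cdr_h3(heavy_chain: str):
    """Return (start, end, sequence) of the first CDR-H3 candidate, or None."""
    match = CDR_H3.search(heavy_chain.upper())
    if match is None:
        return None
    return match.start("cdr_h3"), match.end("cdr_h3"), match.group("cdr_h3")


if __name__ == "__main__":
    # Made-up heavy-chain fragment, for illustration only.
    print(find_cdr_h3("AVYYCARDRGYSSGWYFDVWGQGTLVTVSS"))
```

The lazy quantifier makes the loop stop at the first downstream W-G-x-G motif, so the variable CDR length is handled by the pattern itself rather than by a numbering scheme.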

9.
SUMMARY: Insertional mutagenesis is a powerful method for gene discovery. To identify the location of insertion sites in the genome, linker-based polymerase chain reaction (PCR) methods (such as splinkerette-PCR) may be employed. We have developed a web application called iMapper (Insertional Mutagenesis Mapping and Analysis Tool) for the efficient analysis of insertion site sequence reads against vertebrate and invertebrate Ensembl genomes. Taking linker-based sequences as input, iMapper scans and trims the sequence to remove the linker and sequences derived from the insertional mutagen. The software then identifies and removes contaminating sequences derived from chimeric genomic fragments, vector or the transposon concatamer and then presents the clipped sequence reads to a sequence mapping server, which aligns them to an Ensembl genome. Insertion sites can then be navigated in Ensembl in the context of genomic features such as gene structures. iMapper also generates FASTA (a text-based format for nucleic acid or protein sequences) and GFF (General Feature Format) files of the clipped sequence reads and provides a graphical overview of the mapped insertion sites against a karyotype. iMapper is designed for high-throughput applications and can efficiently process thousands of DNA sequence reads. AVAILABILITY: iMapper is web based and can be accessed at http://www.sanger.ac.uk/cgi-bin/teams/team113/imapper.cgi.
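The trimming step described above can be pictured with a minimal sketch. It assumes the read is laid out as mutagen end, then genomic DNA, then linker, and it uses exact string matching; iMapper itself is not documented here to work this way, and the function name, argument names, and example sequences are all hypothetical.

```python
def clip_read(read: str, mutagen_end: str, linker: str):
    """Return the genomic part between the mutagen end and the linker, or None.

    Assumed layout (illustrative only): ...mutagen_end | genomic | linker...
    Exact matching is used for brevity; real reads contain sequencing errors
    and would need a tolerant alignment instead.
    """
    start = read.find(mutagen_end)
    if start == -1:
        return None
    genomic_start = start + len(mutagen_end)
    end = read.find(linker, genomic_start)
    if end == -1:
        return None
    return read[genomic_start:end]


if __name__ == "__main__":
    # Made-up read: 2 nt of junk, mutagen end, 8 nt genomic, linker, 2 nt junk.
    print(clip_read("NNTAGGGACGTTGCACCTAGGNN", "TAGGG", "CCTAGG"))
```

The clipped sequence is what would then be passed to a genome mapping step.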

10.
We have designed a computer program which rapidly scans nucleic acid sequences to select all possible pairs of oligonucleotides suitable for use as primers to direct efficient DNA amplification by the polymerase chain reaction. This program is based on a set of rules which define in generic terms both the sequence composition of the primers and the amplified region of DNA. These rules (1) enhance primer-to-target sequence hybridization avidity at critical 3'-end extension initiation sites, (2) facilitate attainment of full-length extension during the 72 °C phase by minimizing generation of incomplete or nonspecific product, and (3) limit primer losses occurring from primer-self or primer-primer homologies. Three examples of primer sets chosen by the program that correctly amplified the target regions starting from RNA are shown. This program should facilitate the rapid selection of effective and specific primers from long gene sequences while providing a flexible choice of various primers to focus study on particular regions of interest.
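Two of the rule types mentioned above, stable 3' ends and limited primer-self or primer-primer homology, can be sketched as simple checks. The thresholds and function names below are assumptions chosen for illustration; the abstract does not give the program's actual rules.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_overlap(primer: str, other: str, n: int = 4) -> bool:
    """True if the last n bases of `primer` can anneal somewhere on `other`."""
    return revcomp(primer[-n:]) in other


def acceptable_pair(fwd: str, rev: str, n: int = 4) -> bool:
    """Hypothetical filter: G/C at each 3' end, no 3'-end annealing to self or partner."""
    if fwd[-1] not in "GC" or rev[-1] not in "GC":
        return False
    for a in (fwd, rev):
        for b in (fwd, rev):
            if three_prime_overlap(a, b, n):
                return False
    return True


if __name__ == "__main__":
    print(acceptable_pair("AGTCCTGAAGTTCAC", "CATTGGACCTAAGTG"))
```

A 3' end that anneals to the partner primer (or to the primer itself) can be extended by the polymerase, which is how primer-dimers form and deplete the primer pool.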

11.
We have extended the cDNA sequence of bovine interphotoreceptor retinoid-binding protein (IRBP) and subcloned one of the sequenced cDNA fragments into an expression vector. The nucleotide (nt) sequences of four bovine IRBP cDNA clones have been determined. These sequences, when assembled, cover the 3'-proximal 3629 nt of the IRBP mRNA and encode the C-terminal 551 amino acids (aa) of IRBP. This cDNA sequence validates the intron-exon boundaries predicted from the gene. A 2-kb EcoRI insert from lambda IRBP2, one of the clones sequenced, encoding the C-terminal 136 aa of IRBP, was subcloned into the expression vector pWR590-1. Escherichia coli carrying this plasmid construct, pXS590-IRBP, produced a fusion protein containing 583 N-terminal aa of beta-galactosidase, three linker aa residues, 136 C-terminal aa of IRBP and possibly a number of additional C-terminal residues due to suppressed termination. This 86-kDa fusion protein, purified by detergent/chaotrope extraction followed by reverse-phase high-performance liquid chromatography, cross-reacted with anti-bovine IRBP on Western blots. This protein induced an experimental autoimmune uveo-retinitis and experimental autoimmune pinealitis in Lewis rats indistinguishable from that induced by authentic bovine IRBP. Thus, it is evident that the biological activity of this region of IRBP, as manifested by immunopathogenicity, is retained by the fusion protein.

12.
DomCut: prediction of inter-domain linker regions in amino acid sequences
DomCut is a program that predicts inter-domain linker regions solely from amino acid sequence information. The prediction is made using a linker index deduced from a data set of domain/linker segments. The linker preference profile, which is the linker index averaged along a sequence, can be visualized in the graphical interface.
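A linker preference profile of this kind is essentially a sliding-window average of a per-residue index. The sketch below shows that averaging step only; the index values are invented toy numbers, the window size is arbitrary, and the convention that lower values indicate linker preference is an assumption of this example rather than a statement about DomCut's published index.

```python
def linker_profile(seq, index, window=5):
    """Average a per-residue index over a centred window (shortened at the ends).

    `index` maps one-letter residue codes to scores; residues missing from
    the mapping contribute 0.0. All values here are hypothetical.
    """
    half = window // 2
    scores = [index.get(aa, 0.0) for aa in seq]
    profile = []
    for i in range(len(seq)):
        lo, hi = max(0, i - half), min(len(seq), i + half + 1)
        profile.append(sum(scores[lo:hi]) / (hi - lo))
    return profile


if __name__ == "__main__":
    toy_index = {"A": 1.0, "P": -1.0}  # made-up values, for illustration only
    profile = linker_profile("AAAPPPAAA", toy_index, window=3)
    trough = profile.index(min(profile))
    print(profile, "lowest score at position", trough)
```

A predictor built on such a profile would report positions whose smoothed score crosses a chosen threshold as candidate linker regions.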

13.
AMIGene: Annotation of MIcrobial Genes
AMIGene (Annotation of MIcrobial Genes) is an application for automatically identifying the most likely coding sequences (CDSs) in a large contig or a complete bacterial genome sequence. The first step in AMIGene is dedicated to the construction of Markov models that fit the input genomic data (i.e. the gene model), followed by the combination of well-known gene-finding methods and a heuristic approach for the selection of the most likely CDSs. The web interface allows the user to select one or several gene models to apply to the analysis of the input sequence by the AMIGene program and to visualize the list of predicted CDSs graphically and in a downloadable text format. The AMIGene web site is accessible at the following address: http://www.genoscope.cns.fr/agc/tools/amigene/index.html (Contact: sbocs@genoscope.cns.fr).

14.
Pyrrolo[2,1-c][1,4]benzodiazepine (PBD) dimers are synthetic sequence-selective DNA minor-groove cross-linking agents that possess two electrophilic imine moieties (or their equivalent) capable of forming covalent aminal linkages with guanine C2-NH2 functionalities. The PBD dimer SJG-136, which has a C8–O–(CH2)3–O–C8′ central linker joining the two PBD moieties, is currently undergoing phase II clinical trials, and current research is focused on developing analogues of SJG-136 with different linker lengths and substitution patterns. Using a reversed-phase ion pair HPLC/MS method to evaluate interaction with oligonucleotides of varying length and sequence, we recently reported (JACS, 2009, 131, 13756) that SJG-136 can form three different types of adducts: inter- and intrastrand cross-linked adducts, and mono-alkylated adducts. These studies have now been extended to include PBD dimers with a longer central linker (C8–O–(CH2)5–O–C8′), demonstrating that the type and distribution of adducts appear to depend on (i) the length of the C8/C8′-linker connecting the two PBD units, (ii) the positioning of the two reactive guanine bases on the same or opposite strands, and (iii) their separation (i.e. the number of base pairs, usually ATs, between them). Based on these data, a set of rules is emerging that can be used to predict the DNA-interaction behaviour of a PBD dimer of particular C8–C8′ linker length towards a given DNA sequence. These observations suggest that it may be possible to design PBD dimers to target specific DNA sequences.

15.
Comparative sequence analysis is a powerful approach to identify functional elements in genomic sequences. Herein, we describe AGenDA (Alignment-based GENe Detection Algorithm), a novel method for gene prediction that is based on long-range alignment of syntenic regions in eukaryotic genome sequences. Local sequence homologies identified by the DIALIGN program are searched for conserved splice signals to define potential protein-coding exons; these candidate exons are then used to assemble complete gene structures. The performance of our method was tested on a set of 105 human-mouse sequence pairs. These test runs showed that the sensitivity and specificity of AGenDA are comparable with those of the best gene-prediction program that is currently available. However, since our method is based on a completely different type of input information, it can detect genes that are not detectable by standard methods and vice versa. Thus, our approach seems to be a useful addition to existing gene-prediction programs. Availability: DIALIGN is available through the Bielefeld Bioinformatics Server (BiBiServ) at http://bibiserv.techfak.uni-bielefeld.de/dialign/. The gene-prediction program AGenDA described in this paper will be available through the BiBiServ or MIPS web server at http://mips.gsf.de.

16.
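Recombinant PCR is an in vitro amplification technique that fuses multiple DNA molecules through the joining action of overlapping DNA sequences. It simplifies DNA manipulations such as the assembly of full-length gene sequences, gene fusion, gene disruption, and promoter swapping, and has become an effective tool for DNA analysis. Drawing on practical applications of recombinant PCR in molecular evolution, gene knockout and knock-in, promoter studies, and the construction of transformation vectors for transgenic plants, this study analyzes the factors that affect the technique and proposes optimization schemes for key reaction conditions, including primer design, the length of the DNA overlap, and temperature parameters.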

17.
In this paper we present a branch and bound algorithm for local gapless multiple sequence alignment (motif alignment) and its implementation. The algorithm uses both score-based bounding and a novel bounding technique based on the "consistency" of the alignment. A sequence order independent search tree is used in conjunction with a technique for avoiding redundant calculations inherent in the structure of the tree. This is the first program to exploit the fact that the motif alignment problem is easier for short motifs. Indeed, for a short fixed motif width, the running time of the algorithm is asymptotically linear in the size of the input. We tested the performance of the program on a dataset of 300 E. coli promoter sequences and a dataset of 85 lipocalin protein sequences. For a motif width of 4, the optimal alignment of the entire set of sequences can be found. For the more natural motif width of 6, the program can align 21 sequences of length 100, more than twice the number of sequences which can be aligned by the best previous exact algorithm. The algorithm can relax the constraint of requiring each sequence to be aligned, and align 105 of the 300 promoter sequences with a motif width of 6. For the lipocalin dataset, we introduce a technique for reducing the effective alphabet size with a minimal loss of useful information. With this technique, we show that the program can find meaningful motifs in a reasonable amount of time by optimizing the score over three motif positions.
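The score-based part of such a branch and bound search can be illustrated with a deliberately simplified sketch. It is not the paper's algorithm: it uses a plain consensus score (the sum over motif columns of the most frequent residue count), a fixed sequence order, and only the trivial bound that each unplaced sequence can add at most one to every column. The "consistency" bound, the order-independent tree, and the redundancy avoidance described above are not reproduced, and all names are hypothetical.

```python
from collections import Counter


def consensus_score(seqs, starts, w):
    """Sum over motif columns of the count of the most common residue."""
    total = 0
    for col in range(w):
        column = Counter(seqs[k][s + col] for k, s in enumerate(starts))
        total += max(column.values())
    return total


def best_motif(seqs, w):
    """Exact gapless motif alignment by depth-first branch and bound."""
    best = {"score": -1, "starts": None}

    def search(starts):
        placed = len(starts)
        partial = consensus_score(seqs, starts, w) if starts else 0
        if placed == len(seqs):
            if partial > best["score"]:
                best["score"], best["starts"] = partial, tuple(starts)
            return
        # Upper bound: each remaining sequence adds at most 1 per column.
        if partial + (len(seqs) - placed) * w <= best["score"]:
            return  # prune: this branch cannot beat the incumbent
        for s in range(len(seqs[placed]) - w + 1):
            search(starts + [s])

    search([])
    return best["score"], best["starts"]


if __name__ == "__main__":
    print(best_motif(["AAACGT", "TTACGA", "GGACGC"], w=3))
```

Once a perfect alignment has been found, every other branch is pruned at its first level, which hints at why tighter bounds make exact search feasible for short motifs.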

18.
MOTIVATION: Biologically important proteins are often large, multidomain proteins, which are difficult to characterize by high-throughput experimental methods. Efficient domain/boundary predictions are thus increasingly required in diverse areas of proteomics research for computationally dissecting proteins into readily analyzable domains. RESULTS: We constructed a support vector machine (SVM)-based domain linker predictor, DROP (Domain linker pRediction using OPtimal features), which was trained with 25 optimal features. The optimal combination of features was identified from a set of 3000 features using a random forest algorithm complemented with a stepwise feature selection. DROP demonstrated a prediction sensitivity and precision of 41.3 and 49.4%, respectively. These values were over 19.9% higher than those of control SVM predictors trained with non-optimized features, strongly suggesting the efficiency of our feature selection method. In addition, the mean NDO-Score of DROP for predicting novel domains in seven CASP8 FM multidomain proteins was 0.760, which was higher than that of any of the 12 published CASP8 DP servers. Overall, these results indicate that the SVM prediction of domain linkers can be improved by identifying optimal features that best distinguish linker from non-linker regions.

19.
Structural investigations are frequently hindered by difficulties in obtaining diffracting crystals of the target protein. Here, we report the crystallization and structure solution of the U2AF homology motif (UHM) domain of splicing factor Puf60 fused to Escherichia coli thioredoxin A. Both modules make extensive crystallographic contacts, contributing to a well-defined crystal lattice with clear electron density for both the thioredoxin and the Puf60-UHM module. We compare two short linker sequences between the two fusion domains, GSAM and GSPPM, of which only the GSAM-linked fusion protein yielded diffracting crystals. While specific interdomain contacts are not observed for either fusion protein, NMR relaxation data in solution indicate reduced interdomain mobility between the Trx and Puf60-UHM modules. The GSPPM-linked fusion protein is significantly more flexible, although both linker sequences have the same number of degrees of torsional freedom. Our analysis provides a rationale for the crystallization of the GSAM-linked fusion protein and indicates that, in this case, a four-residue linker between thioredoxin A and the fused target may represent the maximal length for crystallization purposes. Our data provide an experimental basis for the rational design of linker sequences in carrier-driven crystallization and identify thioredoxin A as a powerful fusion partner that can aid crystallization of difficult targets.

20.
To facilitate swift structural characterizations, structural genomic/proteomic projects need to divide large multi-domain proteins into structural domains and to determine their structures separately. Thus, the assignment of structural domains based solely on sequence information, especially on the physico-chemical properties of the amino acid sequences, could be very helpful for such projects. In this study, we examined the characteristics of domain linker sequences, which are loop sequences connecting two structural domains. To this end, we prepared a set of 101 non-redundant multi-domain protein sequences with known structures, and performed an analysis of the linker sequences. The analysis revealed that the frequencies of five (Pro, Gly, Asp, Asn, Lys) amino acid residues differed significantly between the linker and non-linker loop sequences. Moreover, we observed a similar deviation for the residue pair frequencies between the two types of loop sequences. Finally, we describe an automated method, based on the above analysis, to detect loops that have high probabilities of being domain linkers in a protein sequence.
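Comparing residue frequencies between two classes of loop sequences can be sketched as below. The example computes per-residue frequencies and a log ratio with a small pseudocount; it does not perform the significance testing implied by the abstract, and the function names, the pseudocount, and the toy segments are assumptions for illustration.

```python
import math
from collections import Counter


def residue_freqs(segments):
    """Relative frequency of each residue across a list of sequence segments."""
    counts = Counter("".join(segments))
    total = sum(counts.values())
    return {aa: c / total for aa, c in counts.items()}


def log_ratio(linkers, non_linkers, pseudo=1e-3):
    """ln(f_linker / f_non_linker) per residue; positive means enriched in linkers."""
    f_link = residue_freqs(linkers)
    f_non = residue_freqs(non_linkers)
    residues = set(f_link) | set(f_non)
    return {
        aa: math.log((f_link.get(aa, 0.0) + pseudo) / (f_non.get(aa, 0.0) + pseudo))
        for aa in residues
    }


if __name__ == "__main__":
    # Toy segments, not real data.
    ratios = log_ratio(["PPGP", "PG"], ["AGAA", "AP"])
    for aa, value in sorted(ratios.items()):
        print(aa, round(value, 2))
```

The same counting scheme extends to residue pairs by replacing single characters with adjacent two-letter windows.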
