Similar literature
 20 similar documents found.
1.
We have developed a computer program which predicts internal exons from naive genomic sequence data and which will run on any IBM-compatible 80286 (or higher) computer. The algorithm searches a sequence for 'spliceable open reading frames' (SORFs), which are open reading frames bracketed by suitable splice-recognition sequences, and then analyzes the region for codon usage. Potential exons are stratified according to the reliability of their prediction, from confidence levels 1 to 5. The program is designed to predict internal exons of length greater than 60 nucleotides. In an analysis of 116 genes of a training set, 384 out of 441 such exons (87.1%) are identified, with 280 (63.5%) of predictions matching the true exon exactly (at both 5' and 3' splice junctions and in the correct reading frame), and with 104 (23.6%) exons matching partially. In a similar analysis of 14 genes in a test set unrelated to the genes used to generate the parameters of the program, 70 out of 80 internal exons greater than 60 bp in length are identified (87.5%), with 47 completely and 23 partially matched. SORFs that partially match true internal exons share at least one splice junction with the exon, or share both splice junctions but are interpreted in an incorrect reading frame. Specificity (the percentage of SORFs that correspond to true exons) varies from 91% at confidence level 1 to 16% at confidence level 5, with an overall specificity of 35-40%. The output displays nucleotide position, confidence level, reading frame phase at the 5' and 3' ends, acceptor and donor sequences and scoring statistics and also gives an amino acid translation of the potential exon. SORFIND compares favourably with other programs currently used to predict protein-coding regions.
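The idea of a spliceable open reading frame can be illustrated with a minimal Python sketch. It assumes the simplest possible splice signals (an `AG` acceptor dinucleotide immediately upstream and a `GT` donor dinucleotide immediately downstream of the candidate exon) and ignores the splice-site scoring, codon-usage analysis and confidence levels that the abstract attributes to SORFIND. The name `find_sorfs` and its parameters are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}

def find_sorfs(seq, min_len=60):
    """Return (start, end, phase) for stop-free candidate exons seq[start:end].

    Assumption: an exon starts right after an 'AG' and ends right before a 'GT'.
    """
    seq = seq.upper()
    # Candidate exon starts: the position just after each AG acceptor.
    acceptors = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    # Candidate exon ends: the position of each GT donor.
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    hits = []
    for start in acceptors:
        for end in donors:
            if end - start <= min_len:  # keep only exons longer than min_len
                continue
            exon = seq[start:end]
            for phase in range(3):
                codons = (exon[k:k + 3] for k in range(phase, len(exon) - 2, 3))
                if not any(c in STOPS for c in codons):
                    hits.append((start, end, phase))
    return hits
```

Every (acceptor, donor) pair is tested in all three phases, so the sketch is quadratic in the number of splice signals; a real predictor would prune candidates by signal strength first.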

2.
We describe a set of IBM-compatible computer programs designed to selectively identify potential sites for silent mutagenesis within a target DNA sequence. The programs are based on a novel strategy of identifying amino acid motifs compatible with each restriction site (BioTechniques 12:382-384, 1991). They can be used to assess the suitability of a target for the introduction of any 6-base nucleic acid sequence, such as restriction enzyme sites in cassette mutagenesis strategies. The Table program generates a table of multiple amino acid motifs for each restriction enzyme, obtained by translating each unique recognition sequence in all three reading frames. The Silmut program, which utilizes the features of Table, then identifies any match between an amino acid motif of each restriction enzyme and the input target sequence. Minor manipulations of the database files enable the individual researcher to identify the potential for introduction of any 6-base sequence by silent mutagenesis.
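Translating a recognition sequence in all three reading frames can be sketched as follows. This is a rough Python illustration of the idea, not the Table program itself: the function name `site_motifs` is hypothetical, the standard genetic code is assumed, and `*` marks a stop codon. For frames in which the site does not start on a codon boundary, the unknown flanking bases are enumerated and every amino acid that remains possible at each codon position is collected.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def site_motifs(site):
    """Map frame offset (0, 1, 2) to the amino acids allowed at each codon."""
    site = site.upper()
    motifs = {}
    for offset in range(3):  # unknown bases preceding the site in its first codon
        pad_right = -(offset + len(site)) % 3  # unknown bases completing the last codon
        n_codons = (offset + len(site) + pad_right) // 3
        allowed = [set() for _ in range(n_codons)]
        for left in product(BASES, repeat=offset):
            for right in product(BASES, repeat=pad_right):
                dna = "".join(left) + site + "".join(right)
                for pos in range(n_codons):
                    allowed[pos].add(CODON_TABLE[dna[3 * pos:3 * pos + 3]])
        motifs[offset] = ["".join(sorted(s)) for s in allowed]
    return motifs
```

For the EcoRI site `GAATTC`, frame offset 0 gives the dipeptide motif Glu-Phe; a target protein containing that dipeptide could in principle carry the site without changing its amino acid sequence.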

3.
In this paper, we review developments in probabilistic methods of gene recognition in prokaryotic genomes with the emphasis on connections to the general theory of hidden Markov models (HMM). We show that the Bayesian method implemented in GeneMark, a frequently used gene-finding tool, can be augmented and reintroduced as a rigorous forward-backward (FB) algorithm for local posterior decoding described in the HMM theory. Another, earlier method, prokaryotic GeneMark.hmm, uses a modification of the Viterbi algorithm for HMM with duration to identify the most likely global path through hidden functional states given the DNA sequence. GeneMark and GeneMark.hmm programs are worth using in concert for analysing prokaryotic DNA sequences that arguably do not follow any exact mathematical model. The new extension of GeneMark using the FB algorithm was implemented in the software program GeneMark.fba. Given the DNA sequence, this program determines an a posteriori probability for each nucleotide to belong to a coding or non-coding region. Also, for any open reading frame (ORF), it assigns a score defined as a probabilistic measure of all paths through hidden states that traverse the ORF as a coding region. The prediction accuracy of GeneMark.fba determined in our tests compared favourably with the accuracy of the initial (standard) GeneMark program. Comparison to the prokaryotic GeneMark.hmm has also demonstrated a certain, yet species-specific, degree of improvement in raw gene detection, i.e. detection of the correct reading frame (and stop codon). Exact gene prediction, which requires precise prediction of the gene start (which in a prokaryotic genome unambiguously defines the reading frame and stop codon and thus the whole protein product), still remains more accurate in GeneMarkS, which uses a more elaborate HMM to specifically address this task.
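Posterior decoding with the forward-backward algorithm can be sketched on a toy two-state HMM. The Python below is only an illustration under strong assumptions: it uses single-nucleotide emissions and hypothetical parameters supplied by the caller, whereas the GeneMark family relies on considerably richer models. The function name `posterior_states` is likewise hypothetical. No scaling is applied, so the sketch would underflow on long sequences.

```python
def posterior_states(seq, states, start, trans, emit):
    """Return, for each position, the posterior probability of each hidden state."""
    n = len(seq)
    fwd = [{} for _ in range(n)]
    bwd = [{} for _ in range(n)]
    for s in states:
        fwd[0][s] = start[s] * emit[s][seq[0]]
        bwd[n - 1][s] = 1.0
    # Forward pass: probability of the prefix ending in state s at position t.
    for t in range(1, n):
        for s in states:
            fwd[t][s] = emit[s][seq[t]] * sum(fwd[t - 1][r] * trans[r][s] for r in states)
    # Backward pass: probability of the suffix after t given state s at t.
    for t in range(n - 2, -1, -1):
        for s in states:
            bwd[t][s] = sum(trans[s][r] * emit[r][seq[t + 1]] * bwd[t + 1][r] for r in states)
    total = sum(fwd[n - 1][s] for s in states)  # likelihood of the whole sequence
    return [{s: fwd[t][s] * bwd[t][s] / total for s in states} for t in range(n)]
```

Unlike Viterbi decoding, which returns one best global path, this yields a per-nucleotide probability that sums over all paths, which is the distinction the abstract draws between GeneMark.hmm and GeneMark.fba.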

4.
The Northeast Structural Genomics Consortium (NESG) is one of nine NIH-funded pilot projects created to develop technologies needed for structural studies of proteins on a genome-wide scale. One of the most challenging aspects of this emerging field is the production of protein samples amenable to structural determination. To do this efficiently, all steps in the protein production pipeline must be automated. Here we describe the Primer program (linked from http://www-nmr.cabm.rutgers.edu/bioinformatics), a web-based primer design program freely available to the scientific community, which was created to automate this time-consuming and laborious task. This program has the ability to simultaneously calculate plasmid-specific primer sets for multiple open reading frame (ORF) targets, including 96-well and greater formats. Primer includes a library of commonly used plasmid systems and possesses the ability to upload user-defined plasmid systems. In addition to calculating gene-specific annealing regions for each target, the program also adds appropriate restriction endonuclease recognition or viral recombination sites while preserving a reading frame with plasmid-based fusions. Primer has several useful features such as sorting calculated primer sets by target size, facilitating interpretation of PCR amplifications by agarose gel electrophoresis, as well as supplying the molecular biologist with many important characteristics of each target such as the expected size of the PCR-amplified DNA fragment and internal restriction sites. The NESG has cloned over 1500 genes using oligonucleotide primers designed by Primer.

5.
Microcomputer programs for DNA sequence analysis.   Cited by: 21 (self-citations: 5, citations by others: 16)
Computer programs are described which allow (a) analysis of DNA sequences to be performed on a laboratory microcomputer or (b) transfer of DNA sequences between a laboratory microcomputer and another computer system, such as a DNA library. The sequence analysis programs are interactive, do not require prior experience with computers and in many other respects resemble programs which have been written for larger computer systems (1-7). The user enters sequence data into a text file, accesses this file with the programs, and is then able to (a) search for restriction enzyme sites or other specified sequences, (b) translate in one or more reading frames in one or both directions in order to find open reading frames, or (c) determine codon usage in the sequence in one or more given reading frames. The results are given in table format and a restriction map is generated. The modem program permits collection of large amounts of data from a sequence library into a permanent file on the microcomputer disc system, or transfer of laboratory data in the reverse direction to a remote computer system.
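Two of the analyses listed above, searching for a specified sequence and tabulating codon usage in a given reading frame, can be sketched in a few lines of Python. This is an illustrative sketch only; the function names `find_sites` and `codon_usage` are hypothetical and bear no relation to the original microcomputer programs.

```python
from collections import Counter

def find_sites(seq, site):
    """Return the 0-based start positions of every occurrence of site in seq."""
    seq, site = seq.upper(), site.upper()
    return [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]

def codon_usage(seq, frame=0):
    """Count complete codons in seq, reading from offset frame (0, 1 or 2)."""
    seq = seq.upper()
    return Counter(seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
```

Overlapping occurrences are reported by `find_sites`, which matters for palindromic or repetitive recognition sequences.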

6.
MOTIVATION: The advent of genomics yields thousands of reading frames in search of function. Identification of conserved functional motifs in protein sequences can be helpful for function prediction. RESULTS: A database and a classification of reported DNA-binding protein motifs has been designed. A program ('TranScout') has been developed for the detection and evaluation of conserved motifs in prokaryotic and eukaryotic sequences of proteins with a gene regulatory function. The efficiency of the program is shown in a benchmark against a database obtained from SWISS-PROT without the protein sequences used to train the program. All motifs were detected with a mean average sensitivity of 0.98 and a mean average specificity of 0.92. AVAILABILITY: The program is freely available for use on the internet at http://luz.uab.es/transcout/. The user can find additional information at this site.

7.
A method for refining the beginnings of genes and a search for shifts of the reading frame is proposed. The method is based on a comparison of nucleotide and amino acid sequences of homologous genes of related organisms. The algorithm is based on the fact that the rate of changes in the protein-coding regions of the genome is substantially lower than that of noncoding regions. A modification of the Smith-Waterman algorithm is proposed, which makes it possible to align the amino acid sequences obtained by formal translation of the starting nucleotide sequences by taking into account a possible shift of the reading frame. The algorithm has been implemented in the package of ORTOLOGATOR-GeneCorrector programs. Testing the program showed that the approach enables one to detect a wrong annotation of the beginnings in 1% of genes (even in well-studied organisms such as Escherichia coli) and identify several (approximately 10) shifts of the open reading frame. Thus, the algorithm can be used at both the initial and final stages of analysis of the genome.

8.
We have prepared a computer program that predicts complete and partial peptide maps from amino acid sequences. The program fragments amino acid sequences at designated cleavage sites and calculates the molecular weight and relative labeling of each peptide. These data are graphed as log molecular weight of the original protein (X-axis) vs. log molecular weight of the component peptides (Y-axis). The program is interactive, permitting adjustment of a number of graphic parameters and alteration of the position of proteins in the first dimension to accommodate aberrations in protein mobility. The program has been used to predict the V8 protease peptide maps of the 13 open reading frames (ORFs) identified in the human and the mouse mitochondrial DNA (mtDNA) sequences. The results were compared to the V8 protease peptide maps obtained for mouse and human mitochondrially synthesized proteins by two-dimensional proteolytic digest gels. A high correlation was observed between the predicted and observed peptide maps. These results suggest the assignment of several proteins to mtDNA genes.
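Fragmenting a sequence at designated cleavage sites and computing peptide molecular weights can be sketched as below. This Python sketch rests on several assumptions that are not taken from the abstract: cleavage occurs on the C-terminal side of glutamate (the specificity usually attributed to V8 protease), the residue masses are approximate average values in daltons, and the names `digest`, `partial_fragments` and `mol_weight` are hypothetical. Relative labeling and the graphical output are not modelled.

```python
# Approximate average residue masses in Da (assumed values, rounded).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water is added per free peptide

def digest(protein, cleave_after="E"):
    """Complete digest: cut after every residue listed in cleave_after."""
    fragments, current = [], ""
    for aa in protein.upper():
        current += aa
        if aa in cleave_after:
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments

def partial_fragments(fragments):
    """Partial digest: every run of adjacent complete-digest fragments."""
    n = len(fragments)
    return ["".join(fragments[i:j]) for i in range(n) for j in range(i + 1, n + 1)]

def mol_weight(peptide):
    """Approximate molecular weight of a peptide in Da."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
```

Taking the base-10 logarithm of `mol_weight` for the intact protein and for each fragment would give the coordinates of the kind of log-log plot the abstract describes.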

9.
10.
11.
12.
A correspondence between open reading frames in sense and antisense strands is expected from the hypothesis that the prototypic triplet code was of general form RNY, where R is a purine base, N is any base, and Y is a pyrimidine. A deficit of stop codons in the antisense strand (and thus long open reading frames) is predicted for organisms with high G + C percentages; however, two bacteria (Azotobacter vinelandii, Rhodobacter capsulatum) have larger average antisense strand open reading frames than predicted from (G + C)%. The similar codon frequencies found in sense and antisense strands can be attributed to the wide distribution of inverted repeats (stem-loop potential) in natural DNA sequences.
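The RNY argument can be checked with a short Python sketch. Because the complement of a purine is a pyrimidine and vice versa, the reverse complement of an RNY codon is again of the form RNY, and no RNY codon is a stop codon (all three stop codons begin with T). The helper names below are hypothetical, and the measure used (longest stop-free run in codons over three frames) is a simplification chosen for illustration, not the statistic used in the abstract.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}

def reverse_complement(seq):
    """Return the antisense strand read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def longest_open_frame(seq):
    """Longest run of consecutive non-stop codons (in codons) over three frames."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        run = 0
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOPS:
                run = 0
            else:
                run += 1
                best = max(best, run)
    return best
```

Applying `longest_open_frame` to both a sequence and its `reverse_complement` gives a crude way to compare sense and antisense open reading frame lengths.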

13.
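Using Afsk-like, a novel gene closely related to pristinamycin biosynthesis, as a probe, a DNA fragment of approximately 8 kb was obtained by screening a genomic library of Streptomyces pristinaespiralis F618. Sequencing analysis showed that the fragment contains one complete open reading frame of 1,146 nucleotides. The gene was named Spr1 (HQ450023) and is predicted to encode a protein product of 381 amino acids. Analysis with the Blastp program indicated that the gene...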

14.
A program is described for sequence data entry which allows flexible program control by responding to both the keyboard and a sonic digitizer concurrently. Simplification of the initialization stage of each gel reading has been achieved, in comparison with other programs. Received on July 7, 1988; accepted on January 10, 1989

15.
Differences in how writing systems represent language raise important questions about whether there could be a universal functional architecture for reading across languages. In order to study potential language differences in the neural networks that support reading skill, we collected fMRI data from readers of alphabetic (English) and morpho-syllabic (Chinese) writing systems during two reading tasks. In one, participants read short stories under conditions that approximate natural reading, and in the other, participants decided whether individual stimuli were real words or not. Prior work comparing these two writing systems has overwhelmingly used meta-linguistic tasks, generally supporting the conclusion that the reading system is organized differently for skilled readers of Chinese and English. We observed that language differences in the reading network were greatly dependent on task. In lexical decision, a pattern consistent with prior research was observed in which the Middle Frontal Gyrus (MFG) and right Fusiform Gyrus (rFFG) were more active for Chinese than for English, whereas the posterior temporal sulcus was more active for English than for Chinese. We found a very different pattern of language effects in a naturalistic reading paradigm, during which significant differences were only observed in visual regions not typically considered specific to the reading network, and the middle temporal gyrus, which is thought to be important for direct mapping of orthography to semantics. Indeed, in areas that are often discussed as supporting distinct cognitive or linguistic functions between the two languages, we observed interaction. Specifically, language differences were most pronounced in MFG and rFFG during the lexical decision task, whereas no language differences were observed in these areas during silent reading of text for comprehension.

16.
17.
Xu D, Li G, Wu L, Zhou J, Xu Y. Bioinformatics (Oxford, England), 2002, 18(11):1432-1437
MOTIVATION: The DNA microarray is a powerful high-throughput tool for studying gene function and regulatory networks. Due to the problem of potential cross hybridization, using full-length genes for microarray construction is not appropriate in some situations. A bioinformatic tool, PRIMEGENS, has recently been developed for the automatic design of PCR primers using DNA fragments that are specific to individual open reading frames (ORFs). RESULTS: PRIMEGENS first carries out a BLAST search for each target ORF against all other ORFs of the genome to quickly identify possible homologous sequences. Then it performs optimal sequence alignment between the target ORF and each of its homologous ORFs using dynamic programming. PRIMEGENS uses the sequence alignments to select gene-specific fragments, and then feeds the fragments to the Primer3 program to design primer pairs for PCR amplification. PRIMEGENS can be run from the command line on Unix/Linux platforms as a stand-alone package or it can be used from a Web interface. The program runs efficiently, and it takes a few seconds per sequence on a typical workstation. PCR primers specific to individual ORFs from Shewanella oneidensis MR-1 and Deinococcus radiodurans R1 have been designed. The PCR amplification results indicate that this method is very efficient and reliable for designing specific probes for microarray analysis.

18.
Inhibition of gammaherpesvirus replication by RNA interference   Cited by: 14 (self-citations: 0, citations by others: 14)
Jia Q, Sun R. Journal of Virology, 2003, 77(5):3301-3306
RNA interference (RNAi) is a conserved mechanism in which double-stranded, small interfering RNAs (siRNAs) trigger a sequence-specific gene-silencing process. Here we describe the inhibition of murine herpesvirus 68 replication by siRNAs targeted to sequences encoding Rta, an immediate-early protein known as an initiator of the lytic viral gene expression program, and open reading frame 45 (ORF 45), a conserved viral protein. Our results suggest that RNAi can block gammaherpesvirus replication and ORF 45 is required for efficient viral production.

19.
A personal computer program (COMPSEQ) has been developed which can present an informative listing of pre-aligned exonic nucleotide sequences and of their translations to amino acid sequences, as well as run triplet-oriented analyses on these sequences in a given reading frame. The sequence listing focuses on the differences between related sequences by suppressing the concordances between them.

20.
Anthrozoös, 2013, 26(3):381-393
ABSTRACT

When explaining academic outcomes in specific content areas, people reveal their implicit theories of academic ability. Those who hold an entity theory generally attribute differences in achievement to stable, uncontrollable factors. In contrast, those who hold an incremental theory take into account controllable psychological or environmental variables. Implicit theories affect motivation and are expected to crystallize by about fourth grade. This research examined changes in southwest suburban third graders' implicit theories of reading ability for self, others, and other species in a quasi-experimental, crossover design employing entity and incremental treatments. Seventy-one third graders completed a 16-week reading program teaching a dog tasks that supported and challenged entity theories of what dogs can do. A therapy dog acted as our confederate because reading to dogs has been shown to improve children's reading skills, but not necessarily change their beliefs about reading ability, because beliefs are resistant to change and require personal experiences that encourage revision. Repeated measures analysis of covariance (ANCOVA) revealed a significant change in students' theories of reading ability (F(1, 59) = 60.61, p < 0.001). Students' incremental scores increased following the entity condition (F(1, 64) = 1.165, p < 0.02); their entity scores decreased following both conditions (F(1, 59) = 21.90, p < 0.001). Students' implicit theories of reading ability for self, other, and other species did not differ; a significant effect of belief in dogs' reading ability (F(1, 59) = 29.04, p < 0.001) was observed. Implications for increasing children's reading motivation and achievement are discussed.
