Similar articles
 Found 20 similar articles; search took 31 ms
1.
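Development of Windows-based nucleic acid sequence analysis software   Total citations: 7, self-citations: 0, citations by others: 7
With the progress of genome projects and continuing improvements in gene isolation techniques, large numbers of DNA sequences need to be analyzed to obtain useful biological information. However, most sequence analysis software developed to date either offers a narrow range of functions or is expensive, and so does not meet the needs of routine work well. We developed nucleic acid sequence analysis software in the popular Visual Basic language. The resulting program, BioXM, supports translation, ORF finding, sequence alignment, restriction site analysis, assisted primer design, sequence layout formatting, sequence format conversion, and vector sequence removal, and has performed satisfactorily in use.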

2.
Dot-matrix sequence similarity searches can be greatly speeded up through use of a table listing all locations of short oligomers in one of the sequences to find potential similarities with a second sequence. The algorithm described finds similarities between two sequences of lengths M and N, comparing L residues at a time, with an efficiency of L × M × N/S^k, where S is the alphabet size and k is the length of the oligomer. For nucleic acids, in which S = 4, use of a tetranucleotide table results in an efficiency of L × M × N/256. The simplicity of the approach allows for a straightforward calculation of the level of similarities expected to be found for given search parameters. Furthermore, the storage required is minimal, allowing for even large sequences to be compared on small microcomputers. Theoretical considerations regarding the use of this search are discussed.
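
The oligomer-table idea in this abstract can be illustrated with a minimal Python sketch. It is not the published program: the function names, the window-scoring rule, and the default parameters (`k`, `window`, `min_matches`) are assumptions chosen for illustration. The table maps every k-mer of the first sequence to its positions, so only positions that share a k-mer with the second sequence are scored over a window of residues.

```python
from collections import defaultdict

def kmer_table(seq, k):
    """Map each k-mer of seq to the list of positions where it starts."""
    table = defaultdict(list)
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k]].append(i)
    return table

def similar_windows(seq_a, seq_b, k=4, window=8, min_matches=6):
    """Return (pos_a, pos_b, matches) for windows seeded by a shared k-mer.

    Hypothetical scoring rule: count identical residues in a window of
    `window` residues starting at the seed and keep it if the count
    reaches `min_matches`.
    """
    table = kmer_table(seq_a, k)
    seeds = set()
    for j in range(len(seq_b) - k + 1):
        for i in table.get(seq_b[j:j + k], ()):
            seeds.add((i, j))
    hits = []
    for i, j in sorted(seeds):
        a = seq_a[i:i + window]
        b = seq_b[j:j + window]
        if len(a) == window and len(b) == window:
            matches = sum(x == y for x, y in zip(a, b))
            if matches >= min_matches:
                hits.append((i, j, matches))
    return hits
```

Because only seeded positions are scored, the number of window comparisons shrinks roughly by the factor S^k quoted in the abstract when k-mers are uniformly distributed.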

3.
In comparative genomics studies, finding a minimum-length sequence of reversals, so-called sorting by reversals, has been the topic of a huge literature. Since there are many minimum-length sequences, another important topic has been the problem of listing all parsimonious sequences between two genomes, called the All Sorting Sequences by Reversals (ASSR) problem. In this article, we revisit the ASSR problem for uni-chromosomal genomes when no duplications are allowed and when the relative order of the genes is known. We put the current body of work in perspective by illustrating the fundamental framework that is common to all of them, a perspective that allows us for the first time to theoretically compare their running times. The article also proposes an improved framework that empirically speeds up all known algorithms.
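
To make the ASSR problem concrete, here is a brute-force Python sketch that lists every minimum-length sequence of reversals for a very small genome. It assumes genomes are modeled as signed permutations (a common convention, not stated in the abstract) and uses exhaustive breadth-first search, which is only feasible for a handful of genes. It is not one of the algorithms the article compares.

```python
from collections import deque

def reversals(perm):
    """Yield ((i, j), result) for every signed reversal of perm[i..j]."""
    n = len(perm)
    for i in range(n):
        for j in range(i, n):
            flipped = tuple(-x for x in reversed(perm[i:j + 1]))
            yield (i, j), perm[:i] + flipped + perm[j + 1:]

def all_sorting_sequences(perm):
    """List all shortest reversal sequences turning perm into (1, ..., n)."""
    start = tuple(perm)
    target = tuple(range(1, len(start) + 1))
    # Reversals are their own inverses, so distances can be computed
    # by a breadth-first search outward from the identity.
    dist = {target: 0}
    queue = deque([target])
    while queue:
        p = queue.popleft()
        for _, q in reversals(p):
            if q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)

    def extend(p):
        if p == target:
            yield []
            return
        for rev, q in reversals(p):
            if dist[q] == dist[p] - 1:  # step that stays on a shortest path
                for rest in extend(q):
                    yield [rev] + rest

    return list(extend(start))
```

Each returned sequence is a list of `(i, j)` index pairs, applied left to right. The state space grows as 2^n · n!, which is why practical ASSR methods avoid explicit enumeration of permutations.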

4.
Influenza virus A (IVA) infection is responsible for recent deaths worldwide. Hence, there is a need to develop therapeutic agents against the virus. We describe the prediction of short interfering RNA (siRNA) as potential therapeutic molecules for the HA (Haemagglutinin) and NA (Neuraminidase) genes. We screened 90,522 siRNA candidates for HA and 13,576 for NA and selected 1006 and 1307 candidates for HA and NA, respectively, based on the proportion of viral sequences that are targeted by the corresponding siRNA with complete matches. Further shortlisting to select siRNAs with no off-target hits, fulfilling all the guidelines mentioned in the approach, provided us with 13 siRNAs for haemagglutinin and 13 siRNAs for neuraminidase. The approach of finding siRNA using multiple sequence alignments of amino acid sequences has led to the identification of five conserved amino acid sequences, three in haemagglutinin, i.e. RGLFGAIAGFIE, YNAELLV and AIAGFIE, and two in neuraminidase, i.e. RTQSEC and EECSYP, which on reverse translation provided siRNA sequences as potential therapeutic candidates. The approaches used during this study have enabled us to identify potentially therapeutic siRNAs against divergent IVA strains.
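
The selection criterion described here, the proportion of viral sequences that a candidate matches completely, can be sketched as follows. This is an illustrative Python fragment only: the function name, the default candidate length of 21 nucleotides, and the use of a single reference sequence to generate candidates are assumptions, and the off-target and design-guideline filters from the study are not modeled.

```python
def candidate_coverage(reference, strains, length=21):
    """Fraction of strain sequences containing each candidate exactly.

    Candidates are all substrings of `reference` with the given length
    (hypothetical candidate generation, not the study's pipeline).
    """
    coverage = {}
    for i in range(len(reference) - length + 1):
        candidate = reference[i:i + length]
        if candidate not in coverage:
            hits = sum(candidate in strain for strain in strains)
            coverage[candidate] = hits / len(strains)
    return coverage
```

Candidates could then be ranked by coverage and passed on to later filters.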

5.
MOTIVATION: The article presents results of the listing of the quantity of amino acids, dipeptides and tripeptides for all proteins available in the UNIPROT-TREMBL database and the listing for selected species and enzymes. UNIPROT-TREMBL contains protein sequences associated with computationally generated annotations and large-scale functional characterization. Due to the distinct metabolic pathways of amino acid syntheses and their physicochemical properties, the quantities of subpeptides in proteins vary. We have proved that the distribution of amino acids, dipeptides and tripeptides is statistical which confirms that the evolutionary biodiversity development model is subject to the theory of independent events. It seems interesting that certain short peptide combinations occur relatively rarely or even not at all. First, it confirms the Darwinian theory of evolution and second, it opens up opportunities for designing pharmaceuticals among rarely represented short peptide combinations. Furthermore, an innovative approach to the mass analysis of bioinformatic data is presented. CONTACT: eitner@amu.edu.pl SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
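
Counting amino acids, dipeptides and tripeptides amounts to counting overlapping substrings of length 1, 2 and 3. A minimal Python sketch follows. The function names are hypothetical and the 20-letter alphabet is an assumption, so ambiguity codes such as X or B are simply counted as they appear.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed standard 20-letter alphabet

def count_subpeptides(proteins, k):
    """Count overlapping peptides of length k across all protein sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts

def missing_subpeptides(counts, k):
    """Peptides of length k over the standard alphabet that never occur."""
    return [
        "".join(p) for p in product(AMINO_ACIDS, repeat=k)
        if "".join(p) not in counts
    ]
```

For example, `count_subpeptides(sequences, 3)` gives the tripeptide tally and `missing_subpeptides(..., 3)` lists the combinations that do not occur at all in the input.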

6.
A web-based resource, Microbial Community Analysis (MiCA), has been developed to facilitate studies on microbial community ecology that use analyses of terminal-restriction fragment length polymorphisms (T-RFLP) of 16S and 18S rRNA genes. MiCA provides an intuitive web interface to access two specialized programs and a specially formatted database of 16S ribosomal RNA sequences. The first program performs virtual polymerase chain reaction (PCR) amplification of rRNA genes and restriction of the amplicons using primer sequences and restriction enzymes chosen by the user. This program, in silico PCR and Restriction (ISPaR), uses a binary encoding of DNA sequences to rapidly scan large numbers of sequences in databases searching for primer annealing and restriction sites while permitting the user to specify the number of mismatches in primer sequences. ISPaR supports multiple digests with up to three enzymes. The number of base pairs between the 5′ and 3′ primers and the proximal restriction sites can be reported, printed, or exported in various formats. The second program, APLAUS, infers a plausible community structure(s) based on T-RFLP data supplied by a user. APLAUS estimates the relative abundances of populations and reports a listing of phylotypes that are consistent with the empirical data. MiCA is accessible at .
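
The virtual PCR and restriction step can be sketched in a few lines of Python. This is not ISPaR itself: it uses plain string comparison rather than the binary encoding mentioned in the abstract, handles a single enzyme, and all names and parameters (`find_site`, `virtual_trf`, `cut_offset`) are hypothetical. It reports the length of the 5′ terminal fragment, measured from the start of the forward primer to the first cut.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_site(template, primer, max_mismatches):
    """Index of the first window within max_mismatches of primer, or None."""
    for i in range(len(template) - len(primer) + 1):
        window = template[i:i + len(primer)]
        if sum(a != b for a, b in zip(window, primer)) <= max_mismatches:
            return i
    return None

def virtual_trf(template, fwd, rev, site, cut_offset, max_mismatches=0):
    """Length of the 5' terminal restriction fragment of the amplicon.

    `site` is the recognition sequence and `cut_offset` the cut position
    within it (e.g. 2 for GG^CC). Returns None if either primer fails to
    anneal, and the full amplicon length if the site is absent.
    """
    start = find_site(template, fwd, max_mismatches)
    if start is None:
        return None
    rev_site = revcomp(rev)  # reverse primer as it appears on the top strand
    downstream = start + len(fwd)
    offset = find_site(template[downstream:], rev_site, max_mismatches)
    if offset is None:
        return None
    amplicon = template[start:downstream + offset + len(rev_site)]
    pos = amplicon.find(site)
    return len(amplicon) if pos == -1 else pos + cut_offset
```

One simplification to note: with mismatches allowed, a real amplicon carries the primer sequence rather than the template sequence at its ends, which this sketch ignores.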

7.
How is a species' propensity to vary in size over time related to its chances to be listed on the Red List (Red Book), that is to face the risk of extinction? Here I suggest that the linkage between the range of population fluctuations and chance of listing may be established on the basis of the ratio of species' annual fecundity to annual mortality of adults, B/M. For 23 mammalian species inhabiting the vast tracts of Western Siberia, the range of population fluctuations is found to be equal or at least proportional to B/M. Further, this ratio is estimated for a larger set of 90 mammalian species from the territory and coastal waters of the former Soviet Union, of which 25 species are listed in the Red Book of the USSR. The distribution of the Red Book species over the gradient of B/M is clearly non-uniform: most of them have low B/M, which leads to a negative relationship between chance of listing and B/M. The positive relation between population size variability and B/M and the negative relation between chance of listing and B/M suggest the resulting negative relation between chance of listing and population size variability. Both the positive and the negative relations of, respectively, population size variability and chance of listing on B/M are logically justified since B/M is a measure of the population growth rate: it is the total lifetime offspring and an upper estimate of the growth rate per generation time. Fast-growing, more resilient species tend to be more resistant to extinction and hence to have low chance of listing, but simultaneously are able to fluctuate widely. Thus, so long as population growth rate affects both chance of listing and population size variability, high variability implies low chance of listing.

8.
MOTIVATION: Clustering sequences of a full-length cDNA library into alternative splice form candidates is a very important problem. RESULTS: We developed a new efficient algorithm to cluster sequences of a full-length cDNA library into alternative splice form candidates. Current clustering algorithms for cDNAs tend to produce too many clusters containing incorrect splice form candidates. Our algorithm is based on a spliced sequence alignment algorithm that considers splice sites. The spliced sequence alignment algorithm is a variant of an ordinary dynamic programming algorithm, which requires O(nm) time for checking a pair of sequences where n and m are the lengths of the two sequences. Since the time bound is too large to perform all-pair comparison for a large set of sequences, we developed new techniques to reduce the computation time without affecting the accuracy of the output clusters. Our algorithm was applied to 21 076 mouse cDNA sequences of the FANTOM 1.10 database to examine its performance and accuracy. In these experiments, we achieved about 2-12-fold speedup against a method using only a traditional hash-based technique. Moreover, without using any information of the mouse genome sequence data or any gene data in public databases, we succeeded in listing 87-89% of all the clusters that biologists have annotated manually. AVAILABILITY: We provide a web service for cDNA clustering located at https://access.obigrid.org/ibm/cluspa/, for which registration for the OBIGrid (http://www.obigrid.org) is required.

9.
The alignment of homologous sequences with each other and their display has proved a difficult task, despite a frequent requirement for this process. HOMED enables related sequences to be edited and listed in parallel with each other. The editor function uses a full screen editor which emulates the text editors KED and EDT (on PDP-11 and VAX-11 respectively) and which can be adapted to emulate other text editors. This emulation has been adopted to simplify user learning of editing functions. HOMED provides functions for listing the sequences in a variety of formats and for generating a consensus sequence as well as providing a series of tools for maintenance of the sequence database. HOMED has been implemented in Pascal in a modular fashion to enhance portability. Received on November 27, 1986; accepted on January 8, 1987
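
Generating a consensus from sequences listed in parallel can be illustrated with a simple majority rule per alignment column. The abstract does not say how HOMED computes its consensus, so the rule below (most frequent symbol, gaps counted like any other character, ties resolved in favor of the symbol seen first in the column) is an assumption, and the sketch is in Python rather than the Pascal of the original program.

```python
from collections import Counter

def consensus(aligned):
    """Majority-rule consensus of equal-length aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and equally long")
    result = []
    for column in zip(*aligned):
        # most_common keeps first-seen order among equally frequent symbols
        symbol, _ = Counter(column).most_common(1)[0]
        result.append(symbol)
    return "".join(result)
```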

10.
We introduce the PSSH ('Protein Sequence-to-Structure Homologies') database derived from HSSP2, an improved version of the HSSP ('Homology-derived Secondary Structure of Proteins') database [Dodge et al. (1998) Nucleic Acids Res., 26, 313-315]. Whereas each HSSP entry lists all protein sequences related to a given 3D structure, PSSH is the 'inverse', with each entry listing all structures related to a given sequence. In addition, we introduce two other derived databases: HSSPchain, in which each entry lists all sequences related to a given PDB chain, and HSSPalign, in which each entry gives details of one sequence aligned onto one PDB chain. This re-organization makes it easier to navigate from sequence to structure, and to map sequence features onto 3D structures. Currently (September 2002), PSSH provides structural information for over 400 000 protein sequences, covering 48% of SWALL and 61% of SWISS-PROT sequences; HSSPchain provides sequence information for over 25 000 PDB chains, and HSSPalign gives over 14 million sequence-to-structure alignments. The databases can be accessed via SRS 3D, an extension to the SRS system, at http://srs3d.ebi.ac.uk/.

11.
The recent electron microscopic and biochemical mapping of Z-DNA sites in phi X174, SV40, pBR322 and PM2 DNAs has been used to determine two sets of criteria for identification of potential Z-DNA sequences in natural DNA genomes. The prediction of potential Z-DNA tracts and corresponding statistical analysis of their occurrence have been made on a sample of 14 DNA genomes. Alternating purine and pyrimidine tracts longer than 5 base pairs and their clusters (quasi alternating fragments) in the 14 genomes studied are under-represented compared to the expectation from corresponding random sequences. The fragments [d(G X C)]n and [d(C X G)]n (n ≥ 3) in general do not occur in circular DNA genomes and are under-represented in the linear DNAs of phages lambda and T7, whereas in linear genomes of adenoviruses they are strongly over-represented. With minor exceptions, potential Z-DNA sites are also under-represented compared to random sequences. In the 14 genomes studied, predicted Z-DNA tracts occur in non-coding as well as in protein coding regions. The predicted Z-DNA sites in phi X174, SV40, pBR322 and PM2 correspond well with those mapped experimentally. A complete listing together with a compact graphical representation of alternating purine-pyrimidine fragments and their Z-forming potential are presented.
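
Locating alternating purine-pyrimidine tracts longer than 5 base pairs is a single linear scan. The Python sketch below is illustrative only: it covers the tract search but not the paper's two sets of Z-DNA criteria or its statistics, it assumes the input contains only A, C, G and T, and the function name and `min_len` parameter are hypothetical.

```python
PURINES = set("AG")

def alternating_tracts(seq, min_len=6):
    """Return (start, tract) for maximal purine/pyrimidine alternations.

    A tract ends wherever two neighboring bases fall in the same class.
    min_len=6 corresponds to "longer than 5 base pairs".
    """
    seq = seq.upper()
    tracts = []
    start = 0
    for i in range(1, len(seq) + 1):
        at_end = i == len(seq)
        if at_end or (seq[i] in PURINES) == (seq[i - 1] in PURINES):
            if i - start >= min_len:
                tracts.append((start, seq[start:i]))
            start = i
    return tracts
```

Positions are zero-based. Comparing the observed number of tracts with counts from shuffled sequences would be one way to approach the under-representation analysis described above.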

12.
13.
Diversity of Heterolobosea (Excavata) in environments is poorly understood despite their ecological occurrence and health-associated risk, partly because this group tends to be under-covered by most universal eukaryotic primers used for sequencing. To overcome the limits of the traditional morpho-taxonomy-based biomonitoring, we constructed a primer database listing existing and newly designed specific primer pairs that have been evaluated for Heterolobosea 18S rRNA sequencing. In silico taxonomy performance against the current SILVA SSU database allowed the selection of primer pairs that were next evaluated on reference culture amoebal strains. Two primer pairs were retained for monitoring the diversity of Heterolobosea in freshwater environments, using high-throughput sequencing. Results showed that one of the newly designed primer pairs allowed species-level identification of most heterolobosean sequences. Such a primer pair could enable informative, cultivation-free assays for characterizing heterolobosean populations in various environments.

14.
15.
16.
Genomic scrap yard: how genomes utilize all that junk   Total citations: 14, self-citations: 0, citations by others: 14
Makałowski W. Gene, 2000, 259(1-2): 61-67

17.
The SV40 T antigen database is a listing of plasmids and/or viruses that express mutant forms of the virus-encoded large T antigen protein. The parental virus strain, nucleic acid sequence of the mutations, the effect of the mutation on the T antigen amino acid sequence, and key references are included in the listing. The database is available from the authors as a Macintosh FileMaker Pro file, and as a hard copy printout.

18.
A comprehensive and up-to-date listing is provided of the distribution of phenethylamines in the Plant Kingdom. Such a listing is of importance because of their considerable physiological activity in higher animals. Their distribution in plants is also of some taxonomic interest.

19.
We present SequenceMatrix, software that is designed to facilitate the assembly and analysis of multi‐gene datasets. Genes are concatenated by dragging and dropping FASTA, NEXUS, or TNT files with aligned sequences into the program window. A multi‐gene dataset is concatenated and displayed in a spreadsheet; each sequence is represented by a cell that provides information on sequence length, number of indels, the number of ambiguous bases (“Ns”), and the availability of codon information. Alternatively, GenBank numbers for the sequences can be displayed and exported. Matrices with hundreds of genes and taxa can be concatenated within minutes and exported in TNT, NEXUS, or PHYLIP formats, preserving both character set and codon information for TNT and NEXUS files. SequenceMatrix also creates taxon sets listing taxa with a minimum number of characters or gene fragments, which helps assess preliminary datasets. Entire taxa, whole gene fragments, or individual sequences for a particular gene and species can be excluded from export. Data matrices can be re‐split into their component genes and the gene fragments can be exported as individual gene files. SequenceMatrix also includes two tools that help to identify sequences that may have been compromised through laboratory contamination or data management error. One tool lists identical or near‐identical sequences within genes, while the other compares the pairwise distance pattern of one gene against the pattern for all remaining genes combined. SequenceMatrix is Java‐based and compatible with the Microsoft Windows, Apple MacOS X and Linux operating systems. The software is freely available from http://code.google.com/p/sequencematrix/ . © The Willi Hennig Society 2010.
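
The core concatenation step can be sketched in Python. This is not SequenceMatrix code (the program is Java-based), and the data layout, the function name and the use of `?` for taxa missing from a gene are assumptions made for illustration. Each gene is given as a mapping from taxon name to aligned sequence, and the function returns the combined matrix together with 1-based character-set ranges of the kind a NEXUS export would record.

```python
def concatenate(genes):
    """Concatenate per-gene alignments into one matrix.

    genes: {gene_name: {taxon: aligned_sequence}}.
    Returns (matrix, charsets), where matrix maps taxon to the combined
    sequence and charsets maps gene_name to an inclusive 1-based range.
    """
    taxa = sorted({taxon for aln in genes.values() for taxon in aln})
    matrix = {taxon: "" for taxon in taxa}
    charsets = {}
    position = 0
    for gene, alignment in genes.items():
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError(f"gene {gene!r} is empty or not aligned")
        length = lengths.pop()
        for taxon in taxa:
            # Taxa lacking this gene are padded with missing-data symbols.
            matrix[taxon] += alignment.get(taxon, "?" * length)
        charsets[gene] = (position + 1, position + length)
        position += length
    return matrix, charsets
```

Re-splitting the matrix into its component genes is the inverse operation: slice each taxon's sequence with the stored character-set ranges.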

20.