Similar literature
1.
Macintosh sequence analysis software   Total citations: 3, self-citations: 0, citations by others: 3
The analysis of information in nucleotide and amino acid sequence data from an investigator’s own laboratory, or from the ever-growing worldwide databases, is critically dependent on well planned and written software. Although the most powerful packages previously have been confined to workstations, there has been a dramatic increase over the last few years in the sophistication of the programs available for personal computers, as the speed and power of these have increased. A wide choice of software is available for the Macintosh, including the LaserGene suite of programs from DNAStar. This review assesses the strengths and weaknesses of LaserGene and concludes that it provides a useful and comprehensive range of sequence analysis tools.

2.
A computer program, which runs on MS-DOS personal computers, is described that assists in the design of synthetic genes coding for proteins. The goal of the program is the design of a gene which (i) contains as many unique restriction sites as possible and (ii) uses a specific codon usage. The gene designed according to the criteria above is (i) suitable for 'modular mutagenesis' experiments and (ii) optimized for expression. The program 'reverse-translates' protein sequences into degenerate DNA sequences, generates a map of potential restriction sites and locates sequence positions where unique restriction sites can be accommodated. The nucleic acid sequence is then 'refined' according to a specific codon usage to remove any degeneracy. Unique restriction sites, if potentially present, can be 'forced' into the degenerate nucleic acid sequence by using 'priority codes' assigned to different restriction sequences.
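
The reverse-translation and site-mapping steps described in this abstract can be illustrated roughly as follows. This is only a minimal sketch and not the program's actual code: the function names `reverse_translate` and `potential_sites` are hypothetical, the codon table assumes the standard genetic code written with IUPAC ambiguity symbols, and the single-codon symbols used for Leu, Arg, Ser and stop are deliberately over-inclusive simplifications.

```python
# IUPAC nucleotide ambiguity symbols and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# One degenerate codon per amino acid (standard genetic code assumed).
# Caveat: YTN (Leu), MGN (Arg), WSN (Ser) and TRR (stop) also cover a few
# codons for other residues, so hits touching them need a second check.
DEGENERATE_CODON = {
    "A": "GCN", "C": "TGY", "D": "GAY", "E": "GAR", "F": "TTY",
    "G": "GGN", "H": "CAY", "I": "ATH", "K": "AAR", "L": "YTN",
    "M": "ATG", "N": "AAY", "P": "CCN", "Q": "CAR", "R": "MGN",
    "S": "WSN", "T": "ACN", "V": "GTN", "W": "TGG", "Y": "TAY",
    "*": "TRR",
}


def reverse_translate(protein):
    """Return the degenerate DNA sequence encoding a protein string."""
    return "".join(DEGENERATE_CODON[aa] for aa in protein.upper())


def potential_sites(degenerate_dna, site):
    """Return 0-based positions where `site` is compatible with the
    degenerate sequence, i.e. every base of the recognition sequence is
    allowed by the ambiguity symbol at that position."""
    hits = []
    for start in range(len(degenerate_dna) - len(site) + 1):
        window = degenerate_dna[start:start + len(site)]
        if all(base in IUPAC[symbol] for symbol, base in zip(window, site)):
            hits.append(start)
    return hits


# Glu-Phe back-translates to GARTTY, which can accommodate GAATTC (EcoRI).
print(reverse_translate("EF"))
print(potential_sites(reverse_translate("EF"), "GAATTC"))
```

A site reported here is only a candidate: choosing it fixes the codons it overlaps, which is presumably where the codon-usage refinement and the 'priority codes' mentioned in the abstract come in.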

3.
A computer software package called 'FasParser' was developed for manipulating sequence data. It can be used on personal computers to perform a series of analyses, including counting and viewing differences between two sequences at both DNA and codon levels, identifying overlapping regions between two alignments, sorting sequences according to their IDs or lengths, concatenating sequences of multiple loci for a particular set of samples, translating nucleotide sequences to amino acids, and constructing alignments in several different formats, as well as extracting and filtering data from a particular FASTA file. The majority of these functions can be run in batch mode, which is very useful for analyzing large data sets. This package can be used by a broad audience, and is designed for researchers who do not have programming experience in sequence analysis. The GUI version of FasParser can be downloaded from https://github.com/Sun-Yanbo/FasParser, free of charge.
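
As an illustration of one of the listed functions, counting differences between two sequences at the DNA and codon levels might look like the sketch below. It is not FasParser's code: the function name `count_differences` is hypothetical, and the sketch assumes two pre-aligned, equal-length sequences read in frame from the first base.

```python
def count_differences(seq_a, seq_b):
    """Count differing bases and differing codons between two aligned,
    equal-length nucleotide sequences (frame assumed to start at base 0)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    a, b = seq_a.upper(), seq_b.upper()

    # DNA level: compare position by position.
    base_diffs = sum(x != y for x, y in zip(a, b))

    # Codon level: compare complete triplets only, ignoring a trailing
    # partial codon if the length is not a multiple of three.
    usable = len(a) - len(a) % 3
    codon_diffs = sum(a[i:i + 3] != b[i:i + 3] for i in range(0, usable, 3))
    return base_diffs, codon_diffs


# Two bases differ, but both fall in the second codon.
print(count_differences("ATGAAA", "ATGCCA"))
```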

4.
We describe a program which may be used to find approximate matches to a short predefined DNA sequence in a larger target DNA sequence. The program predicts the usefulness of specific DNA probes and sequencing primers and finds nearly identical sequences that might represent the same regulatory signal. The program is written in the C programming language and will run on virtually any computer system with a C compiler, such as the IBM/PC and other computers running under the MS/DOS and UNIX operating systems. The program has been integrated into an existing software package for the IBM personal computer (see article by Mount and Conrad, this volume). Some examples of its use are given.
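
The abstract does not say how approximate matches are scored, so the following is only a sketch of one simple possibility: sliding the probe along the target and counting mismatches, with no insertions or deletions allowed. The function name `approximate_matches` and the `max_mismatches` parameter are assumptions made for illustration, and the sketch is in Python rather than the C of the original program.

```python
def approximate_matches(probe, target, max_mismatches):
    """Return (position, mismatches) for every window of the target that
    differs from the probe at no more than `max_mismatches` positions.
    Positions are 0-based; gaps are not considered."""
    probe, target = probe.upper(), target.upper()
    hits = []
    for start in range(len(target) - len(probe) + 1):
        window = target[start:start + len(probe)]
        mismatches = sum(1 for p, t in zip(probe, window) if p != t)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# One exact hit at position 2 and one single-mismatch hit at position 6.
print(approximate_matches("ACGT", "TTACGTACCT", 1))
```

A probe or primer that yields several low-mismatch hits in the target is more likely to bind at unintended sites, which is the kind of usefulness prediction the abstract refers to.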

5.
Insertional mutagenesis is a powerful tool for generating knockout mutations that facilitate associating biological functions with as yet uncharacterized open reading frames (ORFs) identified by genomic sequencing or represented in EST databases. We have generated a collection of Dissociation (Ds) transposon lines with insertions on all 5 Arabidopsis chromosomes. Here we report the insertion sites in 260 independent single-transposon lines, derived from four different Ds donor sites. We amplified and determined the genomic sequence flanking each transposon, then mapped its insertion site by identity of the flanking sequences to the corresponding sequence in the Arabidopsis genome database. This constitutes the largest collection of sequence-mapped Ds insertion sites unbiased by selection against the donor site. Insertion site clusters have been identified around three of the four donor sites on chromosomes 1 and 5, as well as near the nucleolus organizers on chromosomes 2 and 4. The distribution of insertions between ORFs and intergenic sequences is roughly proportional to the ratio of genic to intergenic sequence. Within ORFs, insertions cluster near the translational start codon, although we have not detected insertion site selectivity at the nucleotide sequence level. A searchable database of insertion site sequences for the 260 transposon insertion sites is available at http://sgio2.biotec.psu.edu/sr. This and other collections of Arabidopsis lines with sequence-identified transposon insertion sites are a valuable genetic resource for functional genomics studies because the transposon location is precisely known, the transposon can be remobilized to generate revertants, and the Ds insertion can be used to initiate further local mutagenesis.

6.
Position-specific substitution matrices, known as profiles, derived from multiple sequence alignments are currently used to search sequence databases for distantly related members of protein families. The performance of the database searches is enhanced by using (i) a sequence weighting scheme which assigns higher weights to more distantly related sequences based on branch lengths derived from phylogenetic trees, (ii) exclusion of positions with mainly padding characters at sites of insertions or deletions and (iii) the BLOSUM62 residue comparison matrix. A natural consequence of these modifications is an improvement in the alignment of new sequences to the profiles. However, the accuracy of the alignments can be further increased by employing a similarity residue comparison matrix. These developments are implemented in a program called PROFILEWEIGHT which runs on Unix and Vax computers. The only input required by the program is the multiple sequence alignment. The output from PROFILEWEIGHT is a profile designed to be used by existing searching and alignment programs. Test results from database searches with four different families of proteins show the improved sensitivity of the weighted profiles.

7.
8.
Insertional mutagenesis is a powerful tool for generating knockout mutations that facilitate associating biological functions with as yet uncharacterized open reading frames (ORFs) identified by genomic sequencing or represented in EST databases. We have generated a collection of Dissociation (Ds) transposon lines with insertions on all 5 Arabidopsis chromosomes. Here we report the insertion sites in 260 independent single-transposon lines, derived from four different Ds donor sites. We amplified and determined the genomic sequence flanking each transposon, then mapped its insertion site by identity of the flanking sequences to the corresponding sequence in the Arabidopsis genome database. This constitutes the largest collection of sequence-mapped Ds insertion sites unbiased by selection against the donor site. Insertion site clusters have been identified around three of the four donor sites on chromosomes 1 and 5, as well as near the nucleolus organizers on chromosomes 2 and 4. The distribution of insertions between ORFs and intergenic sequences is roughly proportional to the ratio of genic to intergenic sequence. Within ORFs, insertions cluster near the translational start codon, although we have not detected insertion site selectivity at the nucleotide sequence level. A searchable database of insertion site sequences for the 260 transposon insertion sites is available at http://sgio2.biotec.psu.edu/sr. This and other collections of Arabidopsis lines with sequence-identified transposon insertion sites are a valuable genetic resource for functional genomics studies because the transposon location is precisely known, the transposon can be remobilized to generate revertants, and the Ds insertion can be used to initiate further local mutagenesis.

9.
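Construction and application of a Web interface for the sequence homology analysis software Blast   Total citations: 5, self-citations: 1, citations by others: 4
A Web interface for the sequence homology analysis software Blast was built on a PC/Linux server within a local area network (intranet). Every computer on the network can access the server through the Web to query both public databases and locally built databases. The setup is confidential, efficient and free of charge, and can meet the large-scale, rapid data-analysis needs of laboratories and research institutes.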

10.
11.
M Nanard, J Nanard. Biochimie, 1985, 67(5): 429-432
Learning methods developed by artificial intelligence research teams are very efficient for biological sequence analysis, but they must run on large computers accessed through terminals. These computers are interfaced with standard displays that involve long and unpleasant handling of alphanumerical data. The "biological work station" is a personal computer with a color graphic screen providing a user-friendly interface for the artificial intelligence learning programs running on large computers. It gives the biologist a convenient graphical tool for sequence analysis built with efficient man-machine communication methods such as multiple windows, icons and mouse selection. It allows the biologist to edit and display sequences in an efficient and natural way, showing the data and the results of the learning programs directly as color pictures.

12.
DNA Strider is a new integrated DNA and Protein sequence analysis program written in the C language for the Macintosh Plus, SE and II computers. It has been designed as an easy to learn and use program as well as a fast and efficient tool for day-to-day sequence analysis work. The program consists of a multi-window sequence editor and of various DNA and Protein analysis functions. The editor may use 4 different types of sequences (DNA, degenerate DNA, RNA and one-letter coded protein) and can simultaneously handle 6 sequences of any type, up to 32.5 kB each. Negative numbering of the bases is allowed for DNA sequences. All classical restriction and translation analysis functions are present and can be performed in any order on any open sequence or part of a sequence. The main feature of the program is that the same analysis function can be repeated several times on different sequences, thus generating multiple windows on the screen. Many graphic capabilities have been incorporated, such as the graphic restriction map, the hydrophobicity profile and the CAI plot (codon adaptation index according to Sharp and Li). The restriction site search uses a newly designed fast hexamer look-ahead algorithm. Typical runtime for the search of all sites with a library of 130 restriction endonucleases is 1 second per 10,000 bases. The circular graphic restriction map of the pBR322 plasmid can therefore be computed from its sequence and displayed on the Macintosh Plus screen within 2 seconds, and its multiline restriction map obtained in a scrolling window within 5 seconds.

13.
Functional constraints to modifications in triterpene cyclase amino acid sequences make them good candidates for evolutionary studies on the phylogenetic relatedness of these enzymes in prokaryotes as well as in eukaryotes. In this study, we used a set of identified triterpene cyclases, a group of mainly bacterial squalene cyclases and a group of predominantly eukaryotic oxidosqualene cyclases, as seed sequences to identify 5288 putative triterpene cyclase homologues in publicly available databases. The Cluster Analysis of Sequences software was used to detect groups of sequences with increased pairwise sequence similarity. The sequences fall into two main clusters, one bacterial and one eukaryotic. The conserved, informative regions of a multiple sequence alignment of the family were used to construct a neighbour-joining phylogenetic tree using AsaturA and a maximum likelihood phylogenetic tree using the PhyML software. Both analyses showed that most of the triterpene cyclase sequences were grouped in line with the accepted taxonomic relationships of the organisms the sequences originated from, supporting the idea of vertical transfer of cyclase genes from parent to offspring as the main evolutionary driving force in this protein family. However, a small group of sequences from three bacterial species (Stigmatella, Gemmata and Methylococcus) grouped with an otherwise purely eukaryotic cluster of oxidosqualene cyclases, while a small group of sequences from seven fungal species and a sequence from the fern Adiantum grouped consistently with a cluster of otherwise purely bacterial squalene cyclases. This suggests that lateral gene transfer may have taken place, entailing a transfer of oxidosqualene cyclases from eukaryotes to bacteria and a transfer of squalene cyclase from bacteria to an ancestor of the group of Pezizomycotina fungi.

14.
A total of 1000 expressed sequence tags (ESTs) corresponding to 760 unique sequence sets were identified using random sequencing of clones from a cDNA library constructed from mycelial RNA of Phytophthora infestans. A number of software programs, represented by a relational database and an analysis pipeline, were developed for the automated analysis and storage of the EST sequence data. A set of 419 nonredundant sequences, which correspond to a total of 632 ESTs (63.2%), were identified as showing significant matches to sequences deposited in public databases. A putative cellular identity and role was assigned to all 419 sequences. All major functional categories were represented by at least several ESTs. Four novel cDNAs containing sequences related to elicitins, a family of structurally related proteins that induce the hypersensitive response and condition avirulence of P. infestans on Nicotiana plants, were among the most notable genes identified. Two of these elicitin-like cDNAs were among the most abundant cDNAs examined. The set also contained several ESTs with high sequence similarity to unique plant genes.

15.
Serum albumin (SA) is a circulating protein providing a depot and carrier for many endogenous and exogenous compounds. At least seven major binding sites have been identified by structural and functional investigations, mainly in human SA. SA is conserved in vertebrates, with at least 49 entries in protein sequence databases. The multiple sequence analysis of this set of entries leads to the definition of a cladistic tree for the molecular evolution of SA orthologs in vertebrates, thus showing the clustering of the considered species, with lamprey SAs (Lethenteron japonicum and Petromyzon marinus) in a separate outgroup. Sequence analysis aimed at searching conserved domains revealed that most SA sequences are made up of three repeated domains (about 600 residues), as extensively characterized for human SA. In contrast, lamprey SAs are giant proteins (about 1400 residues) comprising seven repeated domains. The phylogenetic analysis of the SA family reveals a stringent correlation with the taxonomic classification of the species available in sequence databases. A focused inspection of the sequences of ligand binding sites in SA revealed that in all sites most residues involved in ligand binding are conserved, although the versatility towards different ligands could be peculiar to higher organisms. Moreover, the analysis of molecular links between the different sites suggests that allosteric modulation mechanisms could be restricted to higher vertebrates.

16.
Identification and characterization of new plant microRNAs using EST analysis   Total citations: 50, self-citations: 0, citations by others: 50
Seventy-five previously known plant microRNAs (miRNAs) were classified into 14 families according to their gene sequence identity. A total of 18,694 plant expressed sequence tags (ESTs) were found in the GenBank EST databases by comparing all previously known Arabidopsis miRNAs to GenBank's plant EST databases with BLAST algorithms. After removing the EST sequences with high numbers (more than 2) of mismatched nucleotides, a total of 812 EST contigs were identified. After predicting and scoring the RNA secondary structure of the 812 EST sequences using mFold software, 338 new potential miRNAs were identified in 60 plant species; miRNAs are widespread. Some microRNAs may be highly conserved in the plant kingdom, and they may have the same ancestor in very early evolution. There is no nucleotide substitution in most miRNAs among many plant species. Some of the newly identified potential miRNAs may be induced and regulated by environmental biotic and abiotic stresses. Some may be preferentially expressed in specific tissues, and are regulated by developmental switching. These findings suggest that EST analysis is a good alternative strategy for identifying new miRNA candidates, their targets, and other genes. A large number of miRNAs exist in different plant species and play important roles in plant developmental switching and plant responses to environmental abiotic and biotic stresses as well as signal transduction. Environmental stresses and developmental switching may be the signals for synthesis and regulation of miRNAs in plants. A model for miRNA induction and expression, and gene regulation by miRNA is hypothesized.

17.
We present a web service that automatically assigns sequences to homologous gene families from a set of databases. After identification of the gene family most similar to the query sequence, this sequence is added to the whole alignment and the phylogenetic tree of the family is rebuilt. Thus, the phylogenetic position of the query sequence in its gene family can be easily identified. AVAILABILITY: http://pbil.univ-lyon1.fr/software/HoSeqI/.

18.
19.
Extracting the desired data from a database entry for later analysis is a constant need in the biological sequence analysis community; GeneRecords 1.0 is a solution for GenBank biological flat file parsing, as it implements a structured representation of each feature and feature qualifier in GenBank after import into a common database management system usable on a personal computer (Macintosh and Windows environments). This collection of related databases enables the local management of GenBank records, allowing indexing, retrieval and analysis of both information and sequences on a personal computer. AVAILABILITY: The current release, including the FileMaker Pro runtime application (built for Windows and Macintosh environments), is freely available at http://apollo11.isto.unibo.it/software/

20.
The internal transcribed spacer (ITS) region of the nuclear ribosomal repeat unit holds a central position in the pursuit of the taxonomic affiliation of fungi recovered through environmental sampling. Newly generated fungal ITS sequences are typically compared against the International Nucleotide Sequence Databases for a species or genus name using the sequence similarity software suite blast. Such searches are not without complications however, and one of them is the presence of chimeric entries among the query or reference sequences. Chimeras are artificial sequences, generated unintentionally during the polymerase chain reaction step, that feature sequence data from two (or possibly more) distinct species. Available software solutions for chimera control do not readily target the fungal ITS region, but the present study introduces a blast-based open source software package (available at http://www.emerencia.org/chimerachecker.html) to examine newly generated fungal ITS sequences for the presence of potentially chimeric elements in batch mode. We used the software package on a random set of 12 300 environmental fungal ITS sequences in the public sequence databases and found 1.5% of the entries to be chimeric at the ordinal level after manual verification of the results. The proportion of chimeras in the sequence databases can be hypothesized to increase as emerging sequencing technologies drawing from pooled DNA samples are becoming important tools in molecular ecology research.
