Similar literature
20 similar documents found (search time: 390 ms)
1.
An improved sequence handling package that runs on the Apple Macintosh   Total citations: 4, self-citations: 0, citations by others: 4
We report improvements to our sequence analysis package and adaptation to run on the Apple Macintosh range of machines. The ‘standard’ version of the programs, which run on a VAX, has been given a new user interface that makes the programs very much easier to work with and has facilitated the move to the Macintosh. The reorganization of the code should simplify moves to other systems that offer WIMP user interfaces. In addition to a large number of small but useful extra features, some important new analytical functions have been devised. These include sequence and contig editors; optimal alignment and comparison methods; and a new method for comparing the observed and expected frequencies of selected oligonucleotides. Received on February 12, 1990; accepted on April 19, 1990
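The abstract above mentions comparing observed and expected oligonucleotide frequencies but does not say how the expectation is computed. The sketch below is only one plausible reading: it assumes a zero-order model in which the expected count is the number of windows multiplied by the product of the single-base frequencies. The function name `observed_expected` and this choice of model are illustrative assumptions, not the package's method.

```python
from collections import Counter
from math import prod


def observed_expected(sequence: str, oligo: str) -> tuple[int, float]:
    """Return (observed, expected) counts of `oligo` in `sequence`.

    Assumption: the expected count comes from a zero-order model built
    from the base composition of `sequence` itself.
    """
    seq = sequence.upper()
    oligo = oligo.upper()
    k = len(oligo)
    windows = len(seq) - k + 1
    if k == 0 or windows <= 0:
        return 0, 0.0
    # Count every (possibly overlapping) window that equals the oligonucleotide.
    observed = sum(1 for i in range(windows) if seq[i:i + k] == oligo)
    # Single-base frequencies; a base absent from the sequence has frequency 0.
    base_freq = Counter(seq)
    expected = windows * prod(base_freq[b] / len(seq) for b in oligo)
    return observed, expected


obs, exp = observed_expected("ACGTACGT", "AC")
print(obs, exp)
```

A ratio `obs / exp` well above or below 1 would flag an over- or under-represented oligonucleotide under this simple model; a higher-order (for example, Markov) expectation could be swapped in without changing the interface.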

2.
In previous work, we have shown that a set of characteristics, defined as (code–frequency) pairs, can be derived from a protein family by the use of a signal-processing method. This method enables the location and extraction of sequence patterns by taking into account each (code–frequency) pair individually. In the present paper, we propose to extend this method in order to detect and visualize patterns by taking into account several pairs simultaneously. Two ‘multifrequency’ methods are described. The first one is based on a rewriting of the sequences with new symbols which summarize the frequency information. The second method is based on a clustering of the patterns associated with each pair. Both methods lead to the definition of significant consensus sequences. Some results obtained with calcium-binding proteins and serine proteases are also discussed. Received on March 6, 1990; accepted on September 24, 1990

3.
A space-efficient algorithm for local similarities   Total citations: 3, self-citations: 0, citations by others: 3
Existing dynamic-programming algorithms for identifying similar regions of two sequences require time and space proportional to the product of the sequence lengths. Often this space requirement is more limiting than the time requirement. We describe a dynamic-programming local-similarity algorithm that needs only space proportional to the sum of the sequence lengths. The method can also find repeats within a single long sequence. To illustrate the algorithm's potential, we discuss comparison of a 73 360 nucleotide sequence containing the human β-like globin gene cluster and a corresponding 44 594 nucleotide sequence for rabbit, a problem well beyond the capabilities of other dynamic-programming software. Received on January 29, 1990; accepted on May 30, 1990
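To make the space argument in the abstract above concrete, here is a minimal sketch of the standard two-row trick for local-similarity scoring. It is not the paper's algorithm: it returns only the best local score, whereas the published method also recovers the alignments in linear space. The scoring values (`match`, `mismatch`, `gap`) and the linear gap penalty are assumptions chosen for illustration.

```python
def local_similarity_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local score, keeping only two DP rows in memory."""
    if len(b) > len(a):
        a, b = b, a  # keep the shorter sequence along the row
    prev = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            # Local alignment: a cell never drops below 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr  # the older row is discarded here
    return best


print(local_similarity_score("TTACGTTT", "GGACGGG"))
```

Memory use is proportional to the shorter sequence rather than to the product of the two lengths; running time is still proportional to the product.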

4.
5.
The gene for the light-harvesting chlorophyll a/b binding protein from rice was cloned and sequenced. The clone contains a 798-bp coding sequence, which is identical to that of a cDNA for type I LHCPII (Matsuoka 1990), and its 5'- and 3'-flanking regions. The coding region of this gene is not interrupted by intervening sequences, as reported for type I genes from other plants. In the 5'-flanking region, typical TATA and CAAT boxes are located 30 and 92 bp upstream from the capping site (positions –30 and –92), respectively. A putative phytochrome-responsive element (AAGATAAGG) is located at position –65 between the TATA and CAAT boxes. Comparison of sequences in the 5'-flanking regions between this gene and genes for LHCPII from other gramineous plants indicates that the rice sequence has no apparent homology to that of wheat. However, the rice sequence is highly homologous to the maize sequence, not only around the TATA and CAAT boxes but also in regions further upstream. To investigate the promoter activity of the 5'-flanking region of the gene, a chimeric gene was constructed by fusing the 5'-flanking region to the coding sequence for β-glucuronidase (GUS), and this chimeric gene was introduced into tobacco. The highest activity of GUS was observed in leaf tissue, indicating that the 5'-flanking region of the gene can act as a promoter in an organ-specific manner in tobacco. Histochemical analysis in situ was also performed to determine where GUS activity was expressed. The highest activity was found in leaf mesophyll cells. High activity was also observed in the vascular system of stems and petioles, and low activity was found in root tissue. (Received August 20, 1990; Accepted January 21, 1991)

6.
An algorithm, ‘phylogenetic scanning’, is described for mapping gene conversion events where comparative DNA sequence data are available from different species. In this algorithm, sets of hypothetical phylogenetic trees are constructed that describe possible sequence relationships due to gene conversions in different species lineages; these trees are then evaluated by the principle of parsimony at intervals in the sequence alignment. When used to map gene conversion events that occurred between the pair of -globin genes of higher primates, the algorithm gives results nearly identical to those obtained using a tedious manual approach. Suggestions are also provided for adaptation of this procedure to the analysis of other recombination events. Received on July 3, 1990; accepted on November 8, 1990

7.
We have written two programs for searching biological sequence databases that run on Intel hypercube computers. PSCANLIB compares a single sequence against a sequence library, and PCOMPLIB compares all the entries in one sequence library against a second library. The programs provide a general framework for similarity searching; they include functions for reading in query sequences, search parameters and library entries, and reporting the results of a search. We have isolated the code for the specific function that calculates the similarity score between the query and library sequence; alternative searching algorithms can be implemented by editing two files. We have implemented the rapid FASTA sequence comparison algorithm and the more rigorous Smith-Waterman algorithm within this framework. The PSCANLIB program on a 16 node iPSC/2 80386-based hypercube can compare a 229 amino acid protein sequence with a 3.4 million residue sequence library in ~16 s with the FASTA algorithm. Using the Smith-Waterman algorithm, the same search takes 35 min. The PCOMPLIB program can compare a 0.8 million amino acid protein sequence library with itself in 5.3 min with FASTA on a third-generation 32 node Intel iPSC/860 hypercube. Received on September 8, 1990; accepted on December 15, 1990
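The design idea in the abstract above, a fixed search framework with the similarity-score function isolated so it can be swapped, can be sketched as follows. Everything here is hypothetical: `scan_library`, `identity_count` and the `(name, sequence)` library format are illustrative names and choices, not the PSCANLIB interface, and the toy scoring function merely stands in for FASTA or Smith-Waterman.

```python
from typing import Callable, Iterable, List, Tuple

# Any function mapping (query, library_sequence) to a score can be plugged in.
ScoreFn = Callable[[str, str], float]


def identity_count(query: str, entry: str) -> float:
    """Toy score: number of identical residues at the same position."""
    return sum(1 for q, e in zip(query, entry) if q == e)


def scan_library(query: str,
                 library: Iterable[Tuple[str, str]],
                 score_fn: ScoreFn,
                 top: int = 10) -> List[Tuple[str, float]]:
    """Score the query against every (name, sequence) entry; report the best hits."""
    hits = [(score_fn(query, seq), name) for name, seq in library]
    # Highest score first; ties are broken by entry name for a stable report.
    hits.sort(key=lambda h: (-h[0], h[1]))
    return [(name, score) for score, name in hits[:top]]


library = [("a", "ACGT"), ("b", "ACGA"), ("c", "TTTT")]
print(scan_library("ACGT", library, identity_count))
```

Replacing `identity_count` with a more rigorous scorer changes the search algorithm without touching the reading, ranking or reporting code, which is the separation the abstract describes.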

8.
The advent of the complete genome sequences of various organisms in the mid-1990s raised the issue of how one could determine the function of hypothetical proteins. While insight might be obtained from a 3D structure, the chances of being able to predict such a structure are limited for the deduced amino acid sequence of any uncharacterized gene. A template for modeling is required, but there was only a low probability of finding a protein closely related in sequence with an available structure. Thus, in the late 1990s, an international effort known as structural genomics (SG) was initiated, its primary goal to “fill sequence-structure space” by determining the 3D structures of representatives of all known protein families. This was to be achieved mainly by X-ray crystallography and it was estimated that at least 5,000 new structures would be required. While the proteins (genes) for SG have subsequently been derived from hundreds of different organisms, extremophiles and particularly thermophiles have been specifically targeted due to the increased stability and ease of handling of their proteins, relative to those from mesophiles. This review summarizes the significant impact that extremophiles and proteins derived from them have had on SG projects worldwide. To what extent SG has influenced the field of extremophile research is also discussed.

9.
Reconsideration of the term “gene” should take into account (a) the potential clash between hierarchical levels of information discussed in the 1970s by Gregory Bateson, (b) the contrast between conventional and genome phenotypes discussed in the 1980s by Richard Grantham, and (c) the emergence in the 1990s of a new science, Evolutionary Bioinformatics, that views genomes as channels conveying multiple forms of information through the generations. From this perspective, there is conceptual continuity between the functional “gene” of Mendel and today’s GenBank sequences. If the function attributed to a gene can change specifically as the result of a DNA mutation, then the mutated part of DNA can be considered as part of the gene. Conversely, even if appearing to locate within a gene, a mutation that does not change the specific function is not part of the gene, although it may change some other function to which the DNA sequence contributes. This strict definition is impractical, but serves as a guide to more workable, context-dependent, definitions. The gene is either (1) the DNA sequence that is transcribed, (2) the latter plus the immediate 5′ and 3′ sequences that, when mutated, specifically affect the function, or (3) the latter two, plus any remote sequences that, when mutated, specifically affect the function. Attempts, such as that of Scherrer and Jost, to redefine Mendel’s “gene,” may be too narrowly focused on regulation to the exclusion of other important themes.

10.
Genomics is the study of an organism’s entire genome. It started out as a great scientific endeavor in the 1990s which aimed to sequence the complete genomes of certain biological species. However, viruses are not new to this field, as complete viral genomes have been routinely sequenced for the past thirty years. The ‘genomic era’ has been said to have revolutionized biology. This knowledge of full genomes has created the field of functional genomics in today’s post-genomic era, which is for the most part concerned with studies on the expression of the organism’s genome under different conditions. This article is an attempt to introduce its readers to the application of functional genomics to address and answer several complex biological issues in virus research.

11.
In this paper, we present methods to detect and localize patterns in biologically related protein sequences (family). The patterns common to the sequences of the family are detected by using Fourier analysis. No previous scales (codes) are needed; they are actually produced as a result of the analysis procedure, together with the frequencies of the Fourier decompositions. Characteristic features of the family are thus expressed as (code–frequency) pairs. Various tools are proposed in order to localize the patterns, to compare the codes, and to evaluate the proximity of an arbitrary sequence to the investigated family. The general strategy is illustrated on a family composed of proteins Received on October 17, 1989; accepted on January 16, 1990

12.
To date all attempts to derive a phyletic relationship among restriction endonucleases (ENases) from multiple sequence alignments have been limited by extreme divergence of these enzymes. Based on the approach of Johnson et al. (1990), I report for the first time the evolutionary tree of the ENase-like protein superfamily inferred from quantitative comparison of atomic coordinates of structurally characterized enzymes. The results presented are in harmony with previous comparisons obtained by crystallographic analyses. It is shown that λ-exonuclease initially diverged from the common ancestor and then two “endonucleolytic” families branched out, separating “blunt end cutters” from “5′ four-base overhang cutters.” These data may contribute to a better understanding of ENases and encourage the use of structure-based methods for inference of phylogenetic relationship among extremely divergent proteins. In addition, the comparison of three-dimensional structures of ENase-like domains provides a platform for further clustering analyses of sequence similarities among different branches of this large protein family, rational choice of homology modeling templates, and targets for protein engineering. Received: 14 June 1999 / Accepted: 11 August 1999

13.
The programme pscan has been developed to distribute protein databank scans over a network of computers that share a common filesystem. pscan may be used in conjunction with most conventional sequence comparison programmes with few modifications. In test runs using the Smith-Waterman dynamic programming algorithm, the time required to scan a 6858 sequence databank using a query sequence 740 residues long was reduced from 50 min for a single processor to 11 min for five processors. Accordingly, pscan provides a low-cost, portable alternative to dedicated parallel processing computers. Received on August 27, 1990; accepted on September 25, 1990
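The timings quoted in the abstract above can be turned into a speedup and a parallel efficiency. The sketch below does that arithmetic and also shows one simple way to split a databank among workers. The round-robin split in `partition` is an assumption for illustration; the abstract does not describe how pscan actually divides the work.

```python
from typing import List, Sequence, Tuple


def partition(entries: Sequence, n_workers: int) -> List[list]:
    """Deal entries out round-robin so each worker gets a similar share."""
    return [list(entries[i::n_workers]) for i in range(n_workers)]


def speedup_and_efficiency(t_serial: float, t_parallel: float,
                           n_workers: int) -> Tuple[float, float]:
    """Speedup = serial time / parallel time; efficiency = speedup / workers."""
    speedup = t_serial / t_parallel
    return speedup, speedup / n_workers


# Figures from the abstract: 50 min on one processor, 11 min on five.
speedup, efficiency = speedup_and_efficiency(50, 11, 5)
print(round(speedup, 2), round(efficiency, 2))

# Chunk sizes for a 6858-entry databank split five ways.
print([len(chunk) for chunk in partition(range(6858), 5)])
```

With these numbers the speedup is about 4.5 on five processors, an efficiency of roughly 0.9, which is consistent with the abstract's claim that a network of ordinary machines is a practical alternative to dedicated parallel hardware.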

14.
Screening cultures of nonpathogenic microorganisms led us to a glutamic-acid-specific endopeptidase from Bacillus subtilis ATCC 6051, which we purified and named BSase. The nucleotide sequence encoding BSase, with a molecular mass of 23 894 Da, completely agreed with that of the mpr gene, which had been reported by Rufo Jr. and Sloma et al. to encode a metalloprotease [J Bacteriol (1990) 172:1019–1023 and 1024–1029 respectively]. However, enzymatic characterization revealed it to have the catalytic triad of a serine protease and not the consensus sequence of a metalloprotease, and it was inhibited by diisopropylfluorophosphate. We therefore consider BSase (mpr) to be a serine protease. In the alignment of the acidic-amino-acid-specific proteases, the proteases from bacilli have a highly conserved histidine residue, which is most important in the histidine triad in the proteases from streptomycetes. Furthermore, Ca²⁺ was necessary for its activity and stability. BSase cleaved the C-terminal glutamic acid with high specificity and was very stable over a wide pH range. On the basis of these properties, we tried to retrieve a bioactive peptide from a fusion protein by sequence-specific digestion, and succeeded in obtaining the bioactive peptide. BSase was found to be very useful as a tool for selective cleavage. Received: 24 December 1996 / Received revision: 3 February 1997 / Accepted: 22 February 1997

15.
Here we present a performance test of a Kohonen feature map applied to the fast extraction of uncommon sequences from the coding region of the human insulin receptor gene. We used a network with 30 neurons and with a variable input window. The program was aimed at detecting unique or uncommon DNA regions present in crude sequence data and was able to automatically detect the signal peptide coding regions of a set of human insulin receptor gene data. The testing of this program with HSIRPR cDNA release (EMBL data bank) indicated the presence of unique features in the signal peptide coding region. On the basis of our results this program can automatically detect ‘singularity’ from crude sequencing data and it does not require knowledge of the features to be found. Received on August 27, 1990; accepted on March 14, 1991

16.
This paper presents a method for the multiple alignment of a sequence set. The MASH algorithm uses a non-redundant database of common motifs and an ‘alignment priority’ criterion that depends on the length and the occurrence frequency of the patterns in the set of sequences. This user-defined criterion allows the determination of the series of the patterns to be aligned. This program is applied to a fragment of envelope gene env gp120 for 20 isolates of the immunodeficiency virus. The multiplicity of alignments obtained by modifying the criterion parameters reveals different aspects of similarity between the sequences. Received on June 4, 1990; accepted on December 14, 1990

17.
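Xu Yao, Chen Tao. Acta Ecologica Sinica (《生态学报》), 2016, 36(16): 5078-5087
The grasslands of northern Tibet are an important animal husbandry base and ecological security barrier for China. Using remote sensing and GIS techniques, TM, ETM+ and CBERS remote-sensing images from three periods (1990, 2000 and 2010) were used to monitor grassland degradation in Shenza County (申扎县), and an ecological-economic valuation model was used to estimate the loss in value of eight categories of grassland ecosystem services. The results show that from 1990 to 2010 the area of degraded grassland in Shenza County increased by 47.40 × 10⁴ hm², and the loss of ecosystem service value reached 5.20 × 10⁸ yuan. Grassland degradation was relatively severe from 1990 to 2000, which was also the period of greater loss of ecosystem service value; from 2000 to 2010 the degradation trend slowed. The value of the biomass provided by the northern Tibetan grasslands accounts for only about 7.0% of the total value of their ecosystem services, so the ecological service functions of the grasslands far exceed the value of the biomass they provide. Grasslands must therefore be managed from the perspective of ecosystem services in order to achieve sustainable grassland development.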

18.
The importance of viruses as model organisms is well-established in molecular biology and Max Delbrück’s phage group set standards in the DNA phage field. In this paper, I argue that RNA phages, discovered in the 1960s, were also instrumental in the making of molecular biology. As part of experimental systems, RNA phages stood for messenger RNA (mRNA), genes and genome. RNA was thought to mediate information transfers between DNA and proteins. Furthermore, RNA was more manageable at the bench than DNA due to the availability of specific RNases, enzymes used as chemical tools to analyse RNA. Finally, RNA phages provided scientists with a pure source of mRNA to investigate the genetic code, genes and even a genome sequence. This paper focuses on Walter Fiers’ laboratory at Ghent University (Belgium) and their work on the RNA phage MS2. When setting up his Laboratory of Molecular Biology, Fiers planned a comprehensive study of the virus with a strong emphasis on the issue of structure. In his lab, RNA sequencing, now a little-known technique, evolved gradually from a means to solve the genetic code, to a tool for completing the first genome sequence. Thus, I follow the research pathway of Fiers and his ‘RNA phage lab’ with their evolving experimental system from 1960 to the late 1970s. This study illuminates two decisive shifts in post-war biology: the emergence of molecular biology as a discipline in the 1960s in Europe and of genomics in the 1990s.

19.
Apoproteins of spinach and pea light-harvesting chlorophyll a/b complexes associated with photosystem I (LHCI) were identified by their chlorophyll fluorescence spectra and protein sequences. Spinach LHCI holocomplex consisted of four apoproteins of 25 kDa, 23 kDa, 21 kDa and 20.5 kDa. LHCI subcomplex isolated by sucrose density gradient centrifugation fluoresced at 680 nm with a shoulder around 700–710 nm at 77 K. It contained the 23 kDa protein of which the N-terminal sequence corresponded to Type II gene of LHCI. Another LHCI subcomplex isolated by gel electrophoresis emitted at 679 nm and contained the 25 kDa protein, of which the N-terminus was blocked. Its internal sequences were determined after protease treatment and found to be homologous to Type III gene of LHCI. An oligomeric subcomplex of LHCI isolated by gel electrophoresis emitted at 726 nm and consisted of the 21 kDa and 20.5 kDa apoproteins. N-terminal sequence of the 20.5 kDa component corresponded to the Type I gene of LHCI. The 21 kDa component did not have any clear homologue, but its N-terminal sequence was weakly but significantly homologous to all LHC components particularly to Type I LHCI among others. It was, thus, concluded that the 21 kDa protein is the fourth type of LHCI apoprotein. Similar sequence homology was found for pea LHCI apoproteins. (Received September 10, 1990; Accepted November 22, 1990)

20.