Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
SBASE 5.0 is the fifth release of SBASE, a collection of annotated protein domain sequences that represent various structural, functional, ligand-binding and topogenic segments of proteins. SBASE was designed to facilitate the detection of functional homologies and can be searched with standard database-search programs. The present release contains over 79 863 entries provided with standardized names and is cross-referenced to all major sequence databases and sequence pattern collections. The information is assigned to individual domains rather than to entire protein sequences; thus SBASE contains substantially more cross-references and links than do the protein sequence databases. The entries are clustered into more than 16 000 groups in order to facilitate the detection of distant similarities. SBASE 5.0 is freely available by anonymous 'ftp' file transfer from ftp.icgeb.trieste.it. Automated searching of SBASE with BLAST can be carried out with the WWW server http://www.icgeb.trieste.it/sbase/ and with the electronic mail server sbase@icgeb.trieste.it, which now also provides a graphic representation of the homologies. A related WWW server, http://www.abc.hu/blast.html, and e-mail server, domain@hubi.abc.hu, predict SBASE domain homologies on the basis of SWISS-PROT searches.

2.
SBASE 4.0 is the fourth release of SBASE, a collection of annotated protein domain sequences that represent various structural, functional, ligand binding and topogenic segments of proteins. SBASE was designed to facilitate the detection of functional homologies and can be searched with standard database search tools, such as FASTA and BLAST3. The present release contains 61 137 entries provided with standardized names and cross-referenced to all major protein, nucleic acid and sequence pattern collections. The entries are clustered into 13 155 groups in order to facilitate detection of distant similarities. SBASE 4.0 is freely available by anonymous ftp file transfer from ftp.icgeb.trieste.it. Individual records can be retrieved with the gopher server at icgeb.trieste.it and with a World Wide Web server at http://www.icgeb.trieste.it. Automated searching of SBASE with BLAST can be carried out with the electronic mail server sbase@icgeb.trieste.it, which now also provides a graphic representation of the homologies. A related mail server, domain@hubi.abc.hu, assigns SBASE domain homologies on the basis of SWISS-PROT searches.

3.
SBASE 2.0 is the second release of SBASE, a collection of annotated protein domain sequences. SBASE entries represent various structural, functional, ligand-binding and topogenic segments of proteins [Pongor, S. et al. (1993) Prot. Eng., in press]. This release contains 34,518 entries provided with standardized names and is cross-referenced to the major protein and nucleic acid databanks as well as to the PROSITE catalog of protein sequence patterns [Bairoch, A. (1992) Nucl. Acids Res., 20 suppl, 2013-2018]. SBASE can be used for establishing domain homologies using different database-search tools such as FASTA [Lipman and Pearson (1985) Science, 227, 1436-1441], FASTDB [Brutlag et al. (1990) Comp. Appl. Biosci., 6, 237-245] or BLAST3 [Altschul and Lipman (1990) Proc. Natl. Acad. Sci. USA, 87, 5509-5513], which is especially useful in the case of loosely defined domain types for which efficient consensus patterns cannot be established. SBASE 2.0 and a set of search and retrieval tools are freely available on request to the authors or by anonymous 'ftp' file transfer from ftp.icgeb.trieste.it.

4.
SBASE 7.0 is the seventh release of the SBASE library of protein domain sequences, which contains 237 937 annotated structural, functional, ligand-binding and topogenic segments of proteins, cross-referenced to all major sequence databases and sequence pattern collections. The entries are clustered into over 1811 groups and are provided with two WWW-based search facilities for on-line use. SBASE 7.0 is freely available by anonymous 'ftp' file transfer from ftp.icgeb.trieste.it. Automated searching of SBASE with BLAST can be carried out with the WWW servers http://www.icgeb.trieste.it/sbase/ and http://sbase.abc.hu/sbase/.

5.
SBASE 8.0 is the eighth release of the SBASE library of protein domain sequences that contains 294 898 annotated structural, functional, ligand-binding and topogenic segments of proteins, cross-referenced to most major sequence databases and sequence pattern collections. The entries are clustered into over 2005 statistically validated domain groups (SBASE-A) and 595 non-validated groups (SBASE-B), provided with several WWW-based search and browsing facilities for online use. A domain-search facility was developed, based on non-parametric pattern recognition methods, including artificial neural networks. SBASE 8.0 is freely available by anonymous 'ftp' file transfer from ftp.icgeb.trieste.it. Automated searching of SBASE can be carried out with the WWW servers http://www.icgeb.trieste.it/sbase/ and http://sbase.abc.hu/sbase/.

6.
SBASE (http://www.icgeb.trieste.it/sbase) is an on-line collection of protein domain sequences and related computational tools designed to facilitate detection of domain homologies based on simple database search. The 10th 'jubilee release' of the SBASE library of protein domain sequences contains 1 052 904 protein sequence segments annotated by structure, function, ligand-binding or cellular topology, clustered into over 6000 domain groups. Domain identification and functional prediction are based on a comparison of BLAST search outputs with a knowledge base of biologically significant similarities extracted from known domain groups. The knowledge base is generated automatically for each domain group from the comparison of within-group ('self') and out-of-group ('non-self') similarities. This is a memory-based approach wherein group-specific similarity functions are automatically learned from the database.

7.
RESULTS: A WWW server for protein domain homology prediction, based on BLAST search and a simple data-mining algorithm (Hegyi,H. and Pongor,S. (1993) Comput. Appl. Biosci., 9, 371-372), was constructed providing a tabulated list and a graphic plot of similarities. AVAILABILITY: http://www.icgeb.trieste.it/domain. Mirror site is available at http://sbase.abc.hu/domain. A standalone programme will be available on request. SUPPLEMENTARY INFORMATION: A series of help files is available at the above addresses.

8.
SUMMARY: A WWW server is described for creating 3D models of canonical or bent DNA starting from sequence data. Predicted DNA trajectory is first computed based on a choice of di- and tri-nucleotide models (M.G. Munteanu et al., Trends Biochem. Sci. 23, 341-347, 1998); an atomic model is then constructed and optionally energy-minimized with constrained molecular dynamics. The data are presented as a standard PDB file, directly viewable on the user's PC using any molecule manipulation program. AVAILABILITY: The model.it server is freely available at http://www.icgeb.trieste.it/dna/ CONTACT: kristian@icgeb.trieste.it; pongor@icgeb.trieste.it SUPPLEMENTARY INFORMATION: a series of help files is available at the above address.

9.
MOTIVATION: A key goal of genomics is to assign function to genes, especially for orphan sequences. RESULTS: We compared the clustered functional domains in the SBASE database to each protein sequence using BLASTP. The resulting representation of a protein is a vector in which each non-zero entry indicates a significant match between the sequence of interest and an SBASE domain. Two machine learning methods, the nearest neighbour algorithm (NNA) and support vector machines, are used to predict protein functional classes from this information. The best results were obtained using the SBASE-A database and the NNA, namely 72% accuracy for 79% coverage. We also tested function assignment based on searching for InterPro sequence motifs and on taking the most significant BLAST match within the dataset. We applied the functional domain composition method to predict the functional class of 2018 currently unclassified yeast open reading frames. AVAILABILITY: A program implementing the NNA-based prediction method, called Functional Class Prediction based on Functional Domains (FCPFD), is available and can be obtained by contacting Y.D.Cai at y.cai@umist.ac.uk
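
The functional domain composition idea in the abstract above can be illustrated with a minimal sketch. It assumes each protein has already been reduced to the set of domains it matched significantly; the domain names, the training set, and the use of cosine similarity as the nearest-neighbour measure are all hypothetical choices for illustration, since the abstract does not specify them.

```python
from math import sqrt

def domain_vector(hits, domain_ids):
    """Binary vector: 1 where the protein had a significant match to that domain."""
    return [1 if d in hits else 0 for d in domain_ids]

def cosine(u, v):
    """Cosine similarity; 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def predict_class(query_hits, training, domain_ids):
    """Return the class of the most similar training protein.

    Returns None when the query shares no domain with any training
    protein, i.e. the query is left unclassified.
    """
    q = domain_vector(query_hits, domain_ids)
    best_class, best_sim = None, 0.0
    for hits, label in training:
        sim = cosine(q, domain_vector(hits, domain_ids))
        if sim > best_sim:
            best_class, best_sim = label, sim
    return best_class

# Hypothetical domain identifiers and labelled proteins (not SBASE data).
DOMAIN_IDS = ["kinase", "SH2", "SH3", "zinc_finger", "homeobox"]
TRAINING = [
    ({"kinase", "SH2", "SH3"}, "signalling"),
    ({"zinc_finger", "homeobox"}, "transcription"),
]

print(predict_class({"kinase", "SH2"}, TRAINING, DOMAIN_IDS))
```

Leaving proteins without any domain match unclassified is one way to read the abstract's distinction between accuracy and coverage: accuracy is measured only over the proteins for which a prediction is made.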

10.
SUMMARY: A web server has been established for the statistical evaluation of introns in various taxonomic groups and the comparison of taxonomic groups in terms of intron type, length, base composition, etc. The options include the graphic analysis of splice sites and a probability test for exon-shuffling within the selected group. AVAILABILITY: introns.abc.hu, http://www.icgeb.trieste.it/introns

11.
The ProDom database of protein domain families.   Total citations: 12 (self-citations: 1; citations by others: 11)
F Corpet, J Gouzy, D Kahn. Nucleic Acids Research 1998, 26(1): 323-326
The ProDom database contains protein domain families generated from the SWISS-PROT database by automated sequence comparisons. It can be searched on the World Wide Web (http://protein.toulouse.inra.fr/prodom.html) or by E-mail (prodom@toulouse.inra.fr) to study domain arrangements within known families or new proteins. Strong emphasis has been put on the graphical user interface, which allows for interactive analysis of protein homology relationships. Recent improvements to the server include: ProDom search by keyword; links to PROSITE and PDB entries; more sensitive ProDom similarity search with BLAST or WU-BLAST; alignments of query sequences with homologous ProDom domain families; and links to the SWISS-MODEL server (http://www.expasy.ch/swissmod/SWISS-MODEL.html) for homology based 3-D domain modelling where possible.

12.
The WWW servers at http://www.icgeb.trieste.it/dna/ are dedicated to the analysis of user-submitted DNA sequences; plot.it creates parametric plots of 45 physicochemical, as well as statistical, parameters; bend.it calculates DNA curvature according to various methods. Both programs provide 1D as well as 2D plots that allow localisation of peculiar segments within the query. The server model.it creates 3D models of canonical or bent DNA starting from sequence data and presents the results in the form of a standard PDB file, directly viewable on the user's PC using any molecule manipulation program. The recently established introns server allows statistical evaluation of introns in various taxonomic groups and the comparison of taxonomic groups in terms of length, base composition, intron type etc. The options include the analysis of splice sites and a probability test for exon-shuffling.

13.
PISCES: a protein sequence culling server   Total citations: 21 (self-citations: 0; citations by others: 21)
PISCES is a public server for culling sets of protein sequences from the Protein Data Bank (PDB) by sequence identity and structural quality criteria. PISCES can provide lists culled from the entire PDB or from lists of PDB entries or chains provided by the user. The sequence identities are obtained from PSI-BLAST alignments with position-specific substitution matrices derived from the non-redundant protein sequence database. PISCES therefore provides better lists than servers that use BLAST, which is unable to identify many relationships below 40% sequence identity and often overestimates sequence identity by aligning only well-conserved fragments. PDB sequences are updated weekly. PISCES can also cull non-PDB sequences provided by the user as a list of GenBank identifiers, a FASTA format file, or BLAST/PSI-BLAST output.
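
Culling by sequence identity and structural quality can be sketched as a greedy filter. This is only an illustration under stated assumptions, not the procedure PISCES itself uses: chains are ranked by resolution (lower is better), and a chain is kept only if its identity to every chain already kept does not exceed a cutoff. The chain identifiers, the identity values, and the 25% cutoff are hypothetical; in PISCES the identities come from PSI-BLAST alignments.

```python
def cull(chains, identity, max_identity=25.0):
    """Greedy culling sketch.

    chains:   list of (chain_id, resolution_in_angstrom)
    identity: dict mapping frozenset({a, b}) -> percent sequence identity;
              pairs missing from the dict are treated as 0% identical.
    """
    kept = []
    # Visit the best-resolved chains first so they win ties.
    for chain_id, _resolution in sorted(chains, key=lambda c: c[1]):
        if all(identity.get(frozenset((chain_id, k)), 0.0) <= max_identity
               for k in kept):
            kept.append(chain_id)
    return kept

# Hypothetical chains and pairwise identities.
CHAINS = [("1abcA", 2.5), ("2xyzA", 1.8), ("3pqrB", 2.0)]
IDENTITY = {
    frozenset(("1abcA", "2xyzA")): 90.0,
    frozenset(("2xyzA", "3pqrB")): 12.0,
    frozenset(("1abcA", "3pqrB")): 15.0,
}

print(cull(CHAINS, IDENTITY))
```

Here the two chains that are 90% identical cannot both survive, and the better-resolved one is retained.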

14.
15.
Development and testing of protein classification algorithms are hampered by the fact that the protein universe is characterized by groups vastly different in the number of members, in average protein size, similarity within group, etc. Datasets based on traditional cross-validation (k-fold, leave-one-out, etc.) may not give reliable estimates on how an algorithm will generalize to novel, distantly related subtypes of the known protein classes. Supervised cross-validation, i.e., selection of test and train sets according to the known subtypes within a database has been successfully used earlier in conjunction with the SCOP database. Our goal was to extend this principle to other databases and to design standardized benchmark datasets for protein classification. Hierarchical classification trees of protein categories provide a simple and general framework for designing supervised cross-validation strategies for protein classification. Benchmark datasets can be designed at various levels of the concept hierarchy using a simple graph-theoretic distance. A combination of supervised and random sampling was selected to construct reduced size model datasets, suitable for algorithm comparison. Over 3000 new classification tasks were added to our recently established protein classification benchmark collection that currently includes protein sequence (including protein domains and entire proteins), protein structure and reading frame DNA sequence data. We carried out an extensive evaluation based on various machine-learning algorithms such as nearest neighbor, support vector machines, artificial neural networks, random forests and logistic regression, used in conjunction with comparison algorithms, BLAST, Smith-Waterman, Needleman-Wunsch, as well as 3D comparison methods DALI and PRIDE. The resulting datasets provide lower, and in our opinion more realistic estimates of the classifier performance than do random cross-validation schemes. 
A combination of supervised and random sampling was used to construct model datasets, suitable for algorithm comparison.

The datasets are available at http://hydra.icgeb.trieste.it/benchmark.
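
Supervised cross-validation, as described in the abstract above, selects test and training sets according to known subtypes rather than at random. A minimal sketch follows, assuming a two-level hierarchy (class, subtype): each subtype of the target class is held out in turn as the positive test set. The record names are hypothetical, and the way negatives are divided (whole subtypes alternating between train and test) is an arbitrary choice made here for illustration, not the benchmark collection's actual scheme.

```python
from collections import defaultdict

def supervised_splits(records, target_class):
    """Yield (held_out_subtype, train, test) for each subtype of target_class.

    records: iterable of (protein_id, class_name, subtype).
    train/test: lists of (protein_id, label), label 1 = target class.
    """
    pos = defaultdict(list)
    neg = defaultdict(list)
    for pid, cls, sub in records:
        (pos if cls == target_class else neg)[(cls, sub)].append(pid)

    # Negative subtypes go whole to either train or test, so no negative
    # subtype is seen on both sides (alternation is an arbitrary choice).
    neg_train, neg_test = [], []
    for i, key in enumerate(sorted(neg)):
        (neg_train if i % 2 == 0 else neg_test).extend(neg[key])

    for held_out in sorted(pos):
        train = [(p, 1) for key in sorted(pos) if key != held_out
                 for p in pos[key]]
        train += [(p, 0) for p in neg_train]
        test = [(p, 1) for p in pos[held_out]] + [(p, 0) for p in neg_test]
        yield held_out[1], train, test

# Hypothetical records: (protein id, class, subtype).
RECORDS = [
    ("p1", "globin", "alpha"), ("p2", "globin", "alpha"),
    ("p3", "globin", "beta"),
    ("p4", "kinase", "tyr"), ("p5", "kinase", "ser"),
    ("p6", "protease", "serine"),
]

for subtype, train, test in supervised_splits(RECORDS, "globin"):
    print(subtype, train, test)
```

Because the classifier never sees a member of the held-out subtype during training, the resulting estimate reflects generalization to a novel subtype, which is why such splits tend to give lower scores than random k-fold splits.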


16.
Vlahovicek K, Munteanu MG, Pongor S. Genetica 1999, 106(1-2): 63-73
Bending is a local conformational micropolymorphism of DNA in which the original B-DNA structure is only distorted but not extensively modified. Bending can be predicted by simple static geometry models as well as by a recently developed elastic model that incorporates sequence dependent anisotropic bendability (SDAB). The SDAB model qualitatively explains phenomena including affinity of protein binding, kinking, as well as sequence-dependent vibrational properties of DNA. The vibrational properties of DNA segments can be studied by finite element analysis of a model subjected to an initial bending moment. The frequency spectrum is obtained by applying Fourier analysis to the displacement values in the time domain. This analysis shows that the spectrum of the bending vibrations depends quite sensitively on the sequence; for example, the spectrum of a curved sequence is characteristically different from the spectrum of straight sequence motifs of identical basepair composition. Curvature distributions are genome-specific, and pronounced differences are found between protein-coding and regulatory regions; that is, sites of extreme curvature and/or bendability are less frequent in protein-coding regions. A WWW server is set up for the prediction of curvature and generation of 3D models from DNA sequences (http://www.icgeb.trieste.it/dna). This revised version was published online in October 2005 with corrections to the Cover Date.
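
The step of obtaining a frequency spectrum from time-domain displacement values can be sketched with a plain discrete Fourier transform. The displacement series below is a synthetic 5 Hz sine wave standing in for finite element output, and the sampling interval is likewise an assumption; nothing here reproduces the paper's model.

```python
import cmath
import math

def amplitude_spectrum(displacements, dt):
    """Return [(frequency_hz, amplitude)] for bins 0..N/2 of a real signal.

    displacements: samples taken every dt seconds. The mean is removed
    first so that a constant offset does not dominate the spectrum.
    """
    n = len(displacements)
    mean = sum(displacements) / n
    centred = [x - mean for x in displacements]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centred))
        spectrum.append((k / (n * dt), abs(coeff) / n))
    return spectrum

def dominant_frequency(displacements, dt):
    """Frequency of the bin with the largest amplitude."""
    return max(amplitude_spectrum(displacements, dt), key=lambda fa: fa[1])[0]

# Synthetic stand-in for simulated displacements: 1 s of a 5 Hz oscillation.
DT = 0.01
SIGNAL = [math.sin(2 * math.pi * 5.0 * i * DT) for i in range(100)]

print(dominant_frequency(SIGNAL, DT))
```

The direct sum costs O(N²) and is used only to keep the sketch dependency-free; for long series a fast Fourier transform routine would be the usual choice.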

17.

Background

Alveolar echinococcosis (AE) is a severe chronic hepatic parasitic disease currently emerging in central and eastern Europe. Untreated AE presents a high mortality (>90%) due to a severe hepatic destruction as a result of parasitic metacestode proliferation which behaves like a malignant tumor. Despite this severe course and outcome of disease, the genetic program that regulates the host response leading to organ damage as a consequence of hepatic alveolar echinococcosis is largely unknown.

Methodology/Principal Findings

We used a mouse model of AE to assess gene expression profiles in the liver after establishment of a chronic disease status as a result of a primary peroral infection with eggs of the fox tapeworm Echinococcus multilocularis. Among 38 genes differentially regulated (false discovery rate adjusted p≤0.05), 35 genes were assigned to the functional gene ontology group <immune response>, while 3 were associated with the functional group <intermediary metabolism>. Upregulated genes associated with <immune response> could be clustered into functional subgroups including <macrophages>, <APCs>, <lymphocytes, chemokines and regulation>, <B-cells> and <eosinophils>. Two downregulated genes were related to <lymphocytes, chemokines and regulation> and <intermediary metabolism>, respectively. The <immune response> genes were associated with either an <immunosuppression> or an <immunostimulation> pathway. Of the overexpressed genes, 18 were subsequently processed with a Custom Array microfluidic card system in order to assess their expression status at the mRNA level relative to 5 reference genes (Gapdh, Est1, Rlp3, Mdh-1, Rpl37) selected for their constitutive and stable expression levels. The results generated by the two independent tools used for the assessment of gene expression, i.e., microarray and microfluidic card system, exhibited a high level of congruency (Spearman correlation rho = 0.81, p = 7.87e-5) and thus validated the applied methods.

Conclusions/Significance

Based on this set of biomarkers, new diagnostic targets have been made available to predict disease status and progression. These biomarkers may also offer new targets for immuno-therapeutic intervention.
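
The congruency between the two expression platforms is reported above as a Spearman correlation. As a reminder of what that statistic measures, here is a minimal sketch that ranks both series (averaging tied ranks) and then computes the Pearson correlation of the ranks. The example values are invented and do not come from the study.

```python
def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed series."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Invented fold changes for the same genes on two hypothetical platforms.
MICROARRAY = [1.2, 2.5, 3.1, 4.8, 6.0]
MICROFLUIDIC = [1.0, 4.0, 9.5, 20.0, 41.0]

print(spearman_rho(MICROARRAY, MICROFLUIDIC))
```

Because only ranks enter the calculation, the two platforms can report values on very different scales and still agree perfectly, as long as they order the genes the same way.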

18.
Qualitative and quantitative assessment of heavy metals in the Thermal Power Plant effluent was performed to study the impact of their toxic effects on various biomarkers (carbohydrate, protein and lipid profiles). Heavy metals present in the water were in the order Fe > Cu > Zn > Mn > Ni > Co > Cr. Fe and Ni exceeded, and Cr was equal to, the USA standards set by UNEPGEMS. Glycogen in liver (p < 0.001) and muscle (p < 0.01) was significantly depleted. An insignificant (p < 0.05) decline in blood glucose (−21.0%) and a significant (p < 0.05) elevation in both total protein and globulin in serum, liver and muscle were noted. Albumin decreased significantly (p < 0.01) in serum but showed a significant (p < 0.05) increase in liver and muscle. Thus the A:G ratio fell in serum and rose in liver and muscle. Similarly, the lipid profile was also altered: a significant elevation in serum total lipid (p < 0.01), total cholesterol (p < 0.01), phospholipid (p < 0.05), triglycerides (p < 0.001) and LDL (p < 0.01) was observed, but a significant (p < 0.05) decline in VLDL was recorded. These biomarkers suggested that the fish became hypoglycemic, hyperlipidemic and hypercholesterolemic. Heavy metals also provoked an immune response, as evident from the rise in globulin. In conclusion, the Thermal Power Plant wastewater containing heavy metals induced stress, making fish weak and vulnerable to diseases.

19.
As a fundamental characteristic of soil physical properties, the soil Particle Size Distribution (PSD) is important in research on soil moisture migration, solution transformation, and soil erosion. In this research, the PSD characteristics obtained with distinct methods in different land uses are analyzed. The results show that the upper bound of the volume domain of the clay domain ranges from 5.743 μm to 5.749 μm for all land-use types. For the silt domain of purple soil, the value ranges from 286.852 μm to 286.966 μm. For all purple soil land-use types, the order of the volume domain fractal dimensions is Dclay<Dsilt<Dsand. However, the values of Dsilt and Dsand in the Pinus massoniana Lamb, Robinia pseudoacacia L and Ipomoea batatas are all higher than the corresponding values in the Citrus reticulata Blanco and Setaria viridis. Moreover, in all the land-use types, all of the parameters in volume domain fractal dimension (Dvi) are higher than the corresponding parameter values from the United States Department of Agriculture (Dvi(U)). The correlation study between the volume domain fractal dimension and the soil properties shows that the intensity of correlation to the soil texture and to soil organic matter follows the order Dsilt>Dsilt(U)>Dsand(U)>Dsand and Dsilt>Dsilt(U)>Dsand>Dsand(U), respectively. Compared with all other Dvi, Dsilt has the most significant correlation with the soil texture and organic matter in different land uses of the typical purple soil watersheds. Therefore, Dsilt is a potential indicator for evaluating the proportion of fine particles in the PSD, as well as a key measurement in soil quality and productivity studies.
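
The abstract does not state how the volume fractal dimensions were computed. One commonly used volume-based model relates the cumulative volume fraction of particles smaller than size R to a power law, V(r < R)/V_T = (R/R_max)^(3 − D), so that D follows from the slope of a log-log fit. The sketch below assumes that model and uses synthetic size classes; whether the study used exactly this form, and how it was applied within each size domain, is an assumption.

```python
import math

def volume_fractal_dimension(sizes_um, cumulative_volume_fraction):
    """Estimate D from log(V/V_T) = (3 - D) * log(R/R_max) by least squares.

    sizes_um: upper bounds of the size classes (micrometres).
    cumulative_volume_fraction: fraction of total volume below each bound.
    """
    r_max = max(sizes_um)
    xs = [math.log(r / r_max) for r in sizes_um]
    ys = [math.log(v) for v in cumulative_volume_fraction]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return 3.0 - slope

# Synthetic size classes that follow the power law exactly with D = 2.6.
SIZES_UM = [2.0, 20.0, 50.0, 200.0, 2000.0]
FRACTIONS = [(r / 2000.0) ** (3 - 2.6) for r in SIZES_UM]

print(volume_fractal_dimension(SIZES_UM, FRACTIONS))
```

Under this model a larger D corresponds to a larger share of fine particles, which is consistent with the abstract's use of Dsilt as an indicator of the fine-particle proportion.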

20.
SARS-CoV-2 is the novel coronavirus that is the causative agent of COVID-19, a sometimes-lethal respiratory infection responsible for a world-wide pandemic. The envelope (E) protein, one of four structural proteins encoded in the viral genome, is a 75-residue integral membrane protein whose transmembrane domain exhibits ion channel activity and whose cytoplasmic domain participates in protein-protein interactions. These activities contribute to several aspects of the viral replication-cycle, including virion assembly, budding, release, and pathogenesis. Here, we describe the structure and dynamics of full-length SARS-CoV-2 E protein in hexadecylphosphocholine micelles by NMR spectroscopy. We also characterized its interactions with four putative ion channel inhibitors. The chemical shift index and dipolar wave plots establish that E protein consists of a long transmembrane helix (residues 8–43) and a short cytoplasmic helix (residues 53–60) connected by a complex linker that exhibits some internal mobility. The conformations of the N-terminal transmembrane domain and the C-terminal cytoplasmic domain are unaffected by truncation from the intact protein. The chemical shift perturbations of E protein spectra induced by the addition of the inhibitors demonstrate that the N-terminal region (residues 6–18) is the principal binding site. The binding affinity of the inhibitors to E protein in micelles correlates with their antiviral potency in Vero E6 cells: HMA ≈ EIPA > DMA >> Amiloride, suggesting that bulky hydrophobic groups in the 5’ position of the amiloride pyrazine ring play essential roles in binding to E protein and in antiviral activity. An N15A mutation increased the production of virus-like particles, induced significant chemical shift changes from residues in the inhibitor binding site, and abolished HMA binding, suggesting that Asn15 plays a key role in maintaining the protein conformation near the binding site. 
These studies provide the foundation for complete structure determination of E protein and for structure-based drug discovery targeting this protein.
