Similar literature (20 results found)
1.
A method for flexible fitting of molecular models into three-dimensional electron microscopy (3D-EM) reconstructions at a resolution range of 8-12 Å is proposed. The approach uses the evolutionarily related structural variability existing among the protein domains of a given superfamily, according to structural databases such as CATH. A structural alignment of domains belonging to the superfamily, followed by a principal components analysis, is performed, and the first three principal components of the decomposition are explored. Using rigid body transformations for the secondary structure elements (SSEs) plus the cyclic coordinate descent algorithm to close the loops, stereochemically correct models are built for the structure to fit. All of the models are fitted into the 3D-EM map, and the best one is selected based on cross-correlation measures. This work applies the method to both simulated and experimental data and shows that the flexible fitting was able to produce better results than rigid body fitting.

2.
In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as terms, we have built a retrieval system which is able to find conserved contacts in samples of the myoglobin fold family and to retrieve these proteins among proteins of varied folds with a precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for similar chains from a specific PDB, view and compare their contact maps and browse their structures using a JMol plug-in.

3.
The concept of a flexible protein sequence pattern is defined. In contrast to conventional pattern matching, template or sequence alignment methods, flexible patterns allow residue patterns typical of a complete protein fold to be developed in terms of residue positions (elements), separated by gaps of defined range. An efficient dynamic programming algorithm is presented to enable the best alignment(s) of a pattern with a sequence to be identified. The flexible pattern method is evaluated in detail by reference to the globin protein family, and by comparison to alignment techniques that exploit single sequence, multiple sequence and secondary structural information. A flexible pattern derived from seven globins aligned on structural criteria successfully discriminates all 345 globins from non-globins in the Protein Identification Resource database. Furthermore, a pattern that uses helical regions from just human alpha-haemoglobin identified 337 globins compared to 318 for the best non-pattern global alignment method. Patterns derived from successively fewer, yet more highly conserved positions in a structural alignment of seven globins show that as few as 38 residue positions (25 buried hydrophobic, 4 exposed and 9 others) may be used to uniquely identify the globin fold. The study suggests that flexible patterns gain discriminating power both by discarding regions known to vary within the protein family, and by defining gaps within specific ranges. Flexible patterns therefore provide a convenient and powerful bridge between regular expression pattern matching techniques and more conventional local and global sequence comparison algorithms.
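
The idea of pattern elements separated by gaps of a defined range can be illustrated with a small dynamic-programming sketch. This is not the published algorithm: the pattern format (allowed residues, minimum gap, maximum gap), the function name `best_match`, and the scoring (one point per element whose residue is in its allowed set) are all assumptions made for illustration.

```python
# Hypothetical pattern format: one tuple per element,
# (allowed_residues, min_gap, max_gap). The gap is the number of residues
# skipped between this element and the previous one; the gap range of the
# first element is ignored, so the pattern may start anywhere.

def best_match(pattern, seq):
    """Return (score, positions) for the best placement, or None if none fits."""
    neg = float("-inf")
    n, m = len(seq), len(pattern)
    if n == 0 or m == 0:
        return None
    # score[k][i]: best score with pattern element k placed at sequence index i
    score = [[neg] * n for _ in range(m)]
    back = [[-1] * n for _ in range(m)]
    for i in range(n):
        score[0][i] = 1.0 if seq[i] in pattern[0][0] else 0.0
    for k in range(1, m):
        allowed, min_gap, max_gap = pattern[k]
        for i in range(n):
            hit = 1.0 if seq[i] in allowed else 0.0
            for gap in range(min_gap, max_gap + 1):
                j = i - 1 - gap  # index of the previous element
                if j < 0:
                    break
                cand = score[k - 1][j] + hit
                if cand > score[k][i]:
                    score[k][i] = cand
                    back[k][i] = j
    end = max(range(n), key=lambda i: score[m - 1][i])
    if score[m - 1][end] == neg:
        return None
    # Trace back from the last element to recover every element's position.
    positions = [end]
    for k in range(m - 1, 0, -1):
        positions.append(back[k][positions[-1]])
    positions.reverse()
    return score[m - 1][end], positions


# Toy pattern: [AG], then 1-3 skipped residues, L, then 0-2 skipped residues, [KR]
toy_pattern = [("AG", 0, 0), ("L", 1, 3), ("KR", 0, 2)]
result = best_match(toy_pattern, "MAQQLTKW")
```

The table has one row per pattern element and one column per sequence position, so the cost grows with pattern length, sequence length, and the width of the gap ranges.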

4.
In this paper, based on low-rank representation and eigenface extraction, we present an improvement to the well-known Sparse Representation based Classification (SRC). Firstly, the low-rank images of the face images of each individual in the training subset are extracted by Robust Principal Component Analysis (Robust PCA) to alleviate the influence of noise (e.g., illumination differences and occlusions). Secondly, Singular Value Decomposition (SVD) is applied to extract the eigenfaces from these low-rank, approximate images. Finally, we utilize these eigenfaces to construct a compact and discriminative dictionary for sparse representation. We evaluate our method on five popular databases. Experimental results demonstrate the effectiveness and robustness of our method.

5.
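Objective: Magnetic resonance spectroscopic imaging has seven classes of fast methods, all of which derive from fast MRI methods. The singular value decomposition spectroscopic imaging proposed in this paper differs from these seven classes: it applies the MRI method for reconstructing images from arbitrary trajectories to spectroscopic imaging, which should help in designing faster data-acquisition pulse sequences for spectroscopic imaging.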

6.
Pectoral fins fascinate researchers for their important role in fish maneuvers. With a complicated flexible structure of several fin rays connected by a thin film, the fin exhibits a three-dimensional (3D) motion. The complex 3D fin kinematics makes it challenging to study the performance of the pectoral fin. Nevertheless, a detailed study of the 3D motion pattern of pectoral fins is necessary for the design and control of bio-inspired fin rays. Therefore, a high-speed photography system is introduced in this paper to study the 3D motion of a Koi Carp by analyzing two views of its pectoral fin simultaneously. The key motions of the pectoral fins are first captured in both hovering and retreating. Next, the 3D configuration of the pectoral fins is reconstructed by digital image processing, from which the movement of the fin rays during retreating and hovering is obtained. Furthermore, Singular Value Decomposition (SVD) is adopted to extract the basic motion patterns of the pectoral fins from extensive image sequences, i.e. expansion, bending, cupping, and undulation. It is believed that the movement of the fin rays and the basic patterns of the pectoral fins obtained in the present work can provide a good foundation for the development and control of bionic flexible pectoral fins for underwater propellers.

7.
DNA microarray gene expression and microarray-based comparative genomic hybridization (aCGH) have been widely used for biomedical discovery. Because of the large number of genes and the complex nature of biological networks, various analysis methods have been proposed. One such method is "gene shaving," a procedure which identifies subsets of the genes with coherent expression patterns and large variation across samples. Since combining genomic information from multiple sources can improve classification and prediction of diseases, in this paper we propose a new method, "ICA gene shaving" (ICA, independent component analysis), for jointly analyzing gene expression and copy number data. First we used ICA to analyze joint measurements, gene expression and copy number, of a biological system and project the data onto statistically independent biological processes. Next, we used these results to identify patterns of variation in the data and then applied an iterative shaving method. We investigated the properties of our proposed method by analyzing both simulated and real data. We demonstrated the robustness of our method to noise using simulated data. Using breast cancer data, we showed that our method is superior to the Generalized Singular Value Decomposition (GSVD) gene shaving method for identifying genes associated with breast cancer.

8.
Linear discrimination, from the point of view of numerical linear algebra, can be treated as solving an ill-posed system of linear equations. In order to generate a solution that is robust in the presence of noise, these problems require regularization. Here, we examine the ill-posedness involved in the linear discrimination of cancer gene expression data with respect to outcome and tumor subclasses. We show that a filter factor representation, based upon Singular Value Decomposition, yields insight into the numerical ill-posedness of the hyperplane-based separation when applied to gene expression data. We also show that this representation yields useful diagnostic tools for guiding the selection of classifier parameters, thus leading to improved performance.
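
For orientation, the textbook filter-factor form of a regularized least-squares solution is written out below. The abstract does not give the paper's own formulation, so the notation (A, b, the rank r, the parameters λ and k) and the two example filters, Tikhonov and truncated SVD, are assumptions rather than the authors' definitions.

```latex
% Assumed notation: A = \sum_{i=1}^{r} \sigma_i u_i v_i^{T} is the SVD of the data matrix,
% b holds the class labels, and \lambda and k are regularization parameters.
x_{\mathrm{reg}} = \sum_{i=1}^{r} f_i \, \frac{u_i^{T} b}{\sigma_i} \, v_i,
\qquad
f_i^{\mathrm{Tikhonov}} = \frac{\sigma_i^{2}}{\sigma_i^{2} + \lambda^{2}},
\qquad
f_i^{\mathrm{TSVD}} = \begin{cases} 1, & i \le k \\ 0, & i > k \end{cases}
```

Filter factors near 1 leave a component untouched, while factors near 0 suppress the components with small singular values, which are the ones that amplify noise.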

9.
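To address the blur caused by fruit motion when grading fruit with machine vision, a restoration method based on the matrix generalized inverse and singular value decomposition is proposed. Experiments show that the restored images are relatively sharp and that, while real-time operation is maintained, the relative error of fruit size measurement is reduced from 4.17% to 0.671%. Compared with traditional restoration methods, the approach is faster and eliminates error accumulation, laying a foundation for subsequent edge detection, shape analysis, and defect classification.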

10.

Background: Feature selection is an important pre-processing task in the analysis of complex data. Selecting an appropriate subset of features can improve classification or clustering and lead to better understanding of the data. An important example is that of finding an informative group of genes out of thousands that appear in gene-expression analysis. Numerous supervised methods have been suggested but only a few unsupervised ones exist. Unsupervised Feature Filtering (UFF) is such a method, based on an entropy measure of Singular Value Decomposition (SVD), ranking features and selecting a group of preferred ones.
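
One common way to define an SVD-based entropy and a per-feature score derived from it is sketched below. The abstract does not spell out the formulas, so the symbols and the leave-one-out form of the score are assumptions and may differ from the definitions used in UFF.

```latex
% Assumed notation: s_j are the singular values of the feature-by-sample matrix A,
% N is their number, and A^{(-i)} denotes A with feature i removed.
V_j = \frac{s_j^{2}}{\sum_{k=1}^{N} s_k^{2}},
\qquad
E(A) = -\frac{1}{\log N} \sum_{j=1}^{N} V_j \log V_j,
\qquad
CE_i = E(A) - E\bigl(A^{(-i)}\bigr)
```

Under these assumptions E lies between 0, when a single singular value dominates, and 1, when the spectrum is flat, and features can be ranked by their contribution CE_i.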

11.
Although three-dimensional electron microscopy (3D-EM) permits structural characterization of macromolecular assemblies in distinct functional states, the inability to classify projections from structurally heterogeneous samples has severely limited its application. We present a maximum likelihood-based classification method that does not depend on prior knowledge about the structural variability, and demonstrate its effectiveness for two macromolecular assemblies with different types of conformational variability: the Escherichia coli ribosome and Simian virus 40 (SV40) large T-antigen.

12.
13.
Laser-scanning methods are a means to observe streaming particles, such as the flow of red blood cells in a blood vessel. Typically, particle velocity is extracted from images formed from cyclically repeated line-scan data that is obtained along the center-line of the vessel; motion leads to streaks whose angle is a function of the velocity. Past methods made use of shearing or rotation of the images and a Singular Value Decomposition (SVD) to automatically estimate the average velocity in a temporal window of data. Here we present an alternative method that makes use of the Radon transform to calculate the velocity of streaming particles. We show that this method is over an order of magnitude faster than the SVD-based algorithm and is more robust to noise.

14.
Vorolign, a fast and flexible structural alignment method for two or more protein structures, is introduced. The method aligns protein structures using double dynamic programming and measures the similarity of two residues based on the evolutionary conservation of their corresponding Voronoi-contacts in the protein structure. This similarity function allows aligning protein structures even in cases where structural flexibilities exist. Multiple structural alignments are generated from a set of pairwise alignments using a consistency-based, progressive multiple alignment strategy. RESULTS: The performance of Vorolign is evaluated for different applications of protein structure comparison, including automatic family detection as well as pairwise and multiple structure alignment. Vorolign accurately detects the correct family, superfamily or fold of a protein with respect to the SCOP classification on a set of difficult target structures. A scan against a database of >4000 proteins takes on average 1 min per target. The performance of Vorolign in calculating pairwise and multiple alignments is found to be comparable with other pairwise and multiple protein structure alignment methods. AVAILABILITY: Vorolign is freely available for academic users as a web server at http://www.bio.ifi.lmu.de/Vorolign

15.
Cryo-electron microscopy (cryo-EM) has been widely used to explore conformational states of large biomolecular assemblies. The detailed interpretation of cryo-EM data requires the flexible fitting of a known high-resolution protein structure into a low-resolution cryo-EM map. To this end, we have developed what we believe is a new method based on a two-bead-per-residue protein representation, and a modified form of the elastic network model that allows large-scale conformational changes while maintaining pseudobonds and secondary structures. Our method minimizes a pseudo-energy which linearly combines various terms of the modified elastic network model energy with a cryo-EM-fitting score and a collision energy that penalizes steric collisions. Unlike previous flexible fitting efforts using the lowest few normal modes, our method effectively utilizes all normal modes so that both global and local structural changes can be fully modeled. We have validated our method for a diverse set of 10 pairs of protein structures using simulated cryo-EM maps with a range of resolutions and in the absence/presence of random noise. We have shown that our method is both accurate and efficient compared with alternative techniques, and its performance is robust to the addition of random noise. Our method is also shown to be useful for the flexible fitting of three experimental cryo-EM maps.

16.
The increase of daily released bioinformatic data has generated new ways of organising and disseminating information. Specifically, in the field of sequence data, many efforts have been made not only to store information in databases, but also to annotate it and then share these annotations through a standard XML (eXtensible Markup Language) protocol and appropriate integration clients. This is the context in which the Distributed Annotation System (DAS) has emerged in genomics. Additionally, initiatives in the field of structural data, such as the extension of DAS to atomic resolution data, which generated the SPICE client, have also occurred. This paper presents 3D-EM DAS, a further extension of the DAS protocol that allows sharing annotations about hybrid models. This annotation system has been built on the basis of the EMDB, which stores Three-dimensional Electron Microscopy (3D-EM) volumes, PDB, which houses atomic coordinates, and UniProt (for protein sequences) databases. In this way, annotations for sequences, atomic coordinates, and 3D-EM volumes are collected and displayed through a single graphical visualization client. Thus, users have an integrated view of all the annotations together with the whole macromolecule (3D-EM map coming from EMDB), the atomic resolution structures fitted into it (coordinates coming from PDB) and the sequences corresponding to each of the structures (from UniProt).

17.
The current K-string-based protein sequence comparisons require large amounts of computer memory because the dimension of the protein vector representation grows exponentially with K. In this paper, we propose a novel concept, the “K-string dictionary”, to solve this high-dimensional problem. It allows us to use a much lower dimensional K-string-based frequency or probability vector to represent a protein, and thus significantly reduce the computer memory requirements for their implementation. Furthermore, based on this new concept, we use Singular Value Decomposition to analyze real protein datasets, and the improved protein vector representation allows us to obtain accurate gene trees.
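
The memory argument can be made concrete with a minimal sketch. It assumes that a "K-string dictionary" is simply the set of K-strings that actually occur in the dataset, which may be narrower or broader than the paper's definition; the function names and toy sequences are hypothetical.

```python
from collections import Counter


def kstrings(seq, k):
    """All overlapping substrings of length k, in order of appearance."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_dictionary(seqs, k):
    """Sorted list of the K-strings observed in at least one sequence."""
    return sorted({s for seq in seqs for s in kstrings(seq, k)})


def frequency_vector(seq, k, dictionary):
    """Relative frequency of each dictionary K-string within one sequence."""
    counts = Counter(kstrings(seq, k))
    total = len(seq) - k + 1
    if total <= 0:
        return [0.0] * len(dictionary)
    return [counts[s] / total for s in dictionary]


# Toy data: with 20 amino acids, K = 2 already allows 20**2 = 400 strings,
# but only the handful that occur in the data become vector dimensions.
toy_seqs = ["MKVLA", "KVLAM"]
dictionary = build_dictionary(toy_seqs, 2)
vector = frequency_vector(toy_seqs[0], 2, dictionary)
```

Each vector has as many entries as the dictionary instead of 20^K, which is where the memory saving comes from under this assumption.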

18.
We present computational solutions to two problems of macromolecular structure interpretation from reconstructed three-dimensional electron microscopy (3D-EM) maps of large bio-molecular complexes at intermediate resolution (5 Å-15 Å). The two problems addressed are: (a) 3D structural alignment (matching) between identified and segmented 3D maps of structure units (e.g. trimeric configuration of proteins), and (b) the secondary structure identification of a segmented protein 3D map (i.e. locations of α-helices, β-sheets). For problem (a), we present an efficient algorithm to correlate spatially (and structurally) two 3D maps of structure units. Besides providing a similarity score between structure units, the algorithm yields an effective technique for resolution refinement of repeated structure units, by 3D alignment and averaging. For problem (b), we present an efficient algorithm to compute eigenvalues and link eigenvectors of a Gaussian convoluted structure tensor derived from the protein 3D map, thereby identifying and locating secondary structural motifs of proteins. The efficiency and performance of our approach is demonstrated on several experimentally reconstructed 3D maps of virus capsid shells from single-particle cryo-EM, as well as computationally simulated protein structure density 3D maps generated from protein model entries in the Protein Data Bank.

19.
FlexProt is a novel technique for the alignment of flexible proteins. Unlike all previous algorithms designed to solve the problem of structural comparisons allowing hinge-bending motions, FlexProt does not require a priori knowledge of the location of the hinge(s). FlexProt carries out the flexible alignment, superimposing the matching rigid subpart pairs, and detects the flexible hinge regions simultaneously. A large number of methods are available to handle rigid structural alignment. However, proteins are flexible molecules, which may appear in different conformations. Hence, protein structural analysis requires algorithms that can deal with molecular flexibility. Here, we present a method specifically addressing the flexible protein alignment task. First, the method efficiently detects maximal congruent rigid fragments in both molecules. Transforming the task into a graph theoretic problem, our method proceeds to calculate the optimal arrangement of previously detected maximal congruent rigid fragments. The fragment arrangement does not violate the protein sequence order. A clustering procedure is performed on fragment-pairs which have the same 3-D rigid transformation regardless of insertions and deletions (such as loops and turns) which separate them. Although the theoretical worst case complexity of the algorithm is O(n^6), in practice FlexProt is highly efficient. It performs a structural comparison of a pair of proteins 300 amino acids long in about seven seconds on a standard desktop PC (400 MHz Pentium II processor with 256MB internal memory). We have performed extensive experiments with the algorithm. An assortment of these results is presented here. FlexProt can be accessed via WWW at bioinfo3d.cs.tau.ac.il/FlexProt/.

20.
Time-of-Flight Secondary Ion Mass Spectrometry (TOF-SIMS) was used to determine elemental and biomolecular ions from isolated protein samples. We identified a set of 23 mass-to-charge ratio (m/z) peaks that represent signatures for distinguishing biological samples. The 23 peaks were identified by Singular Value Decomposition (SVD) and Canonical Analysis (CA) to find the underlying structure in the complex mass-spectra data sets. From this modified data, SVD was used to identify sets of m/z peaks, and we used these patterns from the TOF-SIMS data to predict the biological source from which individual mass spectra were generated. The signatures were validated using an additional data set different from the initial training set used to identify the signatures. We present a simple method to identify multiple variables required for sample classification based on mass spectra that avoids overfitting. This is important in a variety of studies using mass spectrometry, including the ability to identify proteins in complex mixtures and for the identification of new biomarkers.
