Similar Literature
20 similar records found (search time: 9 ms)
1.
The level of conservation between two homologous sequences often varies among sequence regions; functionally important domains are more conserved than the remaining regions. Thus, multiple parameter sets should be used in alignment of homologous sequences, with a stringent parameter set for highly conserved regions and a moderate parameter set for weakly conserved regions. We describe an alignment algorithm that allows dynamic use of multiple parameter sets with different levels of stringency in computing an optimal alignment of two sequences. The algorithm dynamically considers various candidate alignments, partitions each candidate alignment into sections, and determines the most appropriate set of parameter values for each section of the alignment. The algorithm and its local alignment version are implemented in a computer program named GAP4. The local alignment algorithm in GAP4, that in its predecessor GAP3, and an ordinary local alignment program, SIM, were evaluated on 257,716 pairs of homologous sequences from 100 protein families. On 168,475 of the 257,716 pairs (65.4%), alignments from GAP4 were more statistically significant than alignments from GAP3 and SIM.
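The paper's multi-parameter scheme builds on the standard dynamic-programming alignment recurrence. As a point of reference, here is a minimal Needleman-Wunsch global aligner with a single, fixed parameter set (the match/mismatch/gap values are illustrative, not GAP4's):

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score by dynamic programming (Needleman-Wunsch).

    GAP4's contribution is to let the stringency of the parameters vary
    between alignment sections; this sketch is only the fixed-parameter
    baseline that such methods extend.
    """
    m, n = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        score[i][0] = i * gap
    for j in range(1, n + 1):
        score[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[m][n]
```

A traceback over the same table would recover the alignment itself; only the score is computed here.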

2.
Identification of homogeneous subsets of images in a macromolecular electron microscopy (EM) image data set is a critical step in single-particle analysis. The task is handled by iterative algorithms, whose performance is compromised by the compounded limitations of image alignment and K-means clustering. Here we describe an approach, iterative stable alignment and clustering (ISAC) that, relying on a new clustering method and on the concepts of stability and reproducibility, can extract validated, homogeneous subsets of images. ISAC requires only a small number of simple parameters and, with minimal human intervention, can eliminate bias from two-dimensional image clustering and maximize the quality of group averages that can be used for ab initio three-dimensional structural determination and analysis of macromolecular conformational variability. Repeated testing of the stability and reproducibility of a solution within ISAC eliminates heterogeneous or incorrect classes and introduces critical validation to the process of EM image clustering.

3.
Landan G, Graur D. Gene. 2009;441(1-2):141-147.
We characterize pairwise and multiple sequence alignment (MSA) errors by comparing true alignments from simulations of sequence evolution with reconstructed alignments. The vast majority of reconstructed alignments contain many errors. Error rates rapidly increase with sequence divergence; thus, even for intermediate degrees of sequence divergence, more than half of the columns of a reconstructed alignment may be expected to be erroneous. In closely related sequences, most errors consist of the erroneous positioning of a single indel event, and their effect is local. As sequences diverge, errors become more complex as a result of the simultaneous mis-reconstruction of many indel events, and the lengths of the affected MSA segments increase dramatically. We found a systematic bias towards underestimation of the number of gaps, which leads to the reconstructed MSA being on average shorter than the true one. Alignment errors are unavoidable even when the evolutionary parameters are known in advance. Correct reconstruction can only be guaranteed when the likelihood of the true alignment is uniquely optimal. However, true alignment features are very frequently sub-optimal or co-optimal, with the result that optimal albeit erroneous features are incorporated into the reconstructed MSA. Progressive MSA utilizes a guide-tree in the reconstruction of MSAs. The quality of the guide-tree was found to affect MSA error levels only marginally.
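A toy illustration of how alignment error can be quantified: treat each alignment column as the tuple of ungapped residue indices it pairs, and count reconstructed columns that never occur in the true alignment. This is a simplified proxy, not the study's actual scoring scheme:

```python
def column_error_rate(true_msa, recon_msa):
    """Fraction of reconstructed-alignment columns absent from the true
    alignment, with each column keyed by which ungapped residues it pairs.
    A toy proxy for column-wise alignment error; published benchmarks use
    richer scores (e.g., sum-of-pairs)."""
    def columns(msa):
        counters = [0] * len(msa)  # running ungapped index per sequence
        cols = []
        for j in range(len(msa[0])):
            col = []
            for i, row in enumerate(msa):
                if row[j] == '-':
                    col.append(None)
                else:
                    col.append(counters[i])
                    counters[i] += 1
            cols.append(tuple(col))
        return cols

    true_cols = set(columns(true_msa))
    recon_cols = columns(recon_msa)
    wrong = sum(1 for c in recon_cols if c not in true_cols)
    return wrong / len(recon_cols)
```

For example, shifting a single gap one column to the left misplaces two of four columns, giving an error rate of 0.5 even though only one indel was mis-positioned, which echoes the paper's point that single-indel errors dominate between closely related sequences.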

4.
Dickson RJ, Gloor GB. PLoS ONE. 2012;7(6):e37645.
The use of sequence alignments to understand protein families is ubiquitous in molecular biology. High quality alignments are difficult to build and protein alignment remains one of the largest open problems in computational biology. Misalignments can lead to inferential errors about protein structure, folding, function, phylogeny, and residue importance. Identifying alignment errors is difficult because alignments are built and validated on the same primary criterion: sequence conservation. Local covariation identifies systematic misalignments and is independent of conservation. We demonstrate an alignment curation tool, LoCo, that integrates local covariation scores with the Jalview alignment editor. Using LoCo, we illustrate how local covariation is capable of identifying alignment errors due to the reduction of positional independence in the region of misalignment. We highlight three alignments from the benchmark database, BAliBASE 3, that contain regions of high local covariation, and investigate the causes to illustrate these types of scenarios. Two alignments contain sequential and structural shifts that cause elevated local covariation. Realignment of these misaligned segments reduces local covariation; these alternative alignments are supported with structural evidence. We also show that local covariation identifies active site residues in a validated alignment of paralogous structures. LoCo is available at https://sourceforge.net/projects/locoprotein/files/.

5.
Chen H, Kihara D. Proteins. 2008;71(3):1255-1274.
The error in protein tertiary structure prediction is unavoidable, but it is not explicitly shown in most current prediction algorithms. The estimated error of a predicted structure is crucial information for experimental biologists who use the prediction model to design and interpret experiments. Here, we propose a method to estimate errors in predicted structures based on the stability of the optimal target-template alignment when compared with a set of suboptimal alignments. The stability of the optimal alignment is quantified by an index named the SuboPtimal Alignment Diversity (SPAD). We implemented SPAD in a profile-based threading algorithm and investigated how well SPAD can indicate errors in threading models using a large benchmark dataset of 5232 alignments. SPAD shows a very good correlation not only to alignment shift errors but also to structure-level errors: the root mean square deviation (RMSD) of predicted structure models to the native structures (i.e., global errors) and local errors at each residue position. We further compared SPAD with seven other quality measures, six sequence alignment-based measures and one atomic statistical potential, discrete optimized protein energy (DOPE), in terms of the correlation coefficient to the global and local structure-level errors. In terms of the correlation to the RMSD of structure models, when a target and a template are in the same SCOP family, sequence identity showed the best correlation to the RMSD; at the superfamily level, SPAD was the best; and at the fold level, DOPE was the best. However, in a head-to-head comparison, SPAD outperformed the other measures. Next, SPAD was compared with three other measures of local errors; in this comparison, SPAD was the best at all of the family, superfamily, and fold levels. Using the discovered correlation, we also predicted the global and local errors of our predicted structures of CASP7 targets using SPAD.
Finally, we propose a sausage representation of predicted tertiary structures that intuitively indicates the predicted structure and its estimated error range simultaneously.
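The exact SPAD formula is defined in the paper; as a rough illustration of the underlying idea, the following toy index averages how far a set of suboptimal target-to-template mappings deviate from the optimal one (the dict-based mapping representation and the averaging scheme are assumptions, not the published definition):

```python
def alignment_diversity(optimal, suboptimals):
    """Toy stability index in the spirit of SPAD: the mean absolute shift,
    per aligned residue, between the optimal target-template mapping and a
    set of suboptimal mappings.  A stable (low-diversity) alignment should
    signal a more reliable threading model.  Mappings are dicts of
    target_position -> template_position."""
    total, count = 0.0, 0
    for sub in suboptimals:
        for pos, tmpl in optimal.items():
            if pos in sub:
                total += abs(sub[pos] - tmpl)
                count += 1
    return total / count if count else 0.0
```

A diversity of 0 means every suboptimal alignment agrees with the optimal one; larger values indicate that small score perturbations move residues around, i.e. the alignment (and hence the model built from it) is less trustworthy.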

6.
We present an image segmentation algorithm for small intestinal glands consisting of goblet cells that are evenly distributed and arranged in parallel at the base. Making use of the properties of the chain distribution of the goblet cells, directional 2-dimensional (2-D) linear filters with different orientations were designed to enhance the rims of the intestinal glands. Segmentations are based on the combined responses of the multiple zero-phase directional 2-D linear filters. For comparisons, outputs of combined directional filters are shown along with those of the comparable nondirectional Gaussian filters. Segmentation results of small intestinal glands of both normal and cancer cases are provided.

7.
  1. When we collect the growth curves of many individuals, orderly variation in the curves is often observed rather than a completely random mixture of various curves. Small individuals may exhibit similar growth curves that differ from those of large individuals, with the curves varying gradually from small to large individuals. It has been recognized that if all the growth curves are identical after standardization by their asymptotes (an anamorphic growth curve set), the growth curve set can be estimated using nonchronological data; otherwise, that is, if the growth curves are not identical after standardization by their asymptotes (a polymorphic growth curve set), this estimation is not feasible. However, because a given set of growth curves determines the variation in the observed data, it may be possible to estimate polymorphic growth curve sets using nonchronological data.
  2. In this study, we developed an estimation method by deriving the likelihood function for polymorphic growth curve sets. The method involves simple maximum likelihood estimation. Weighted nonlinear regression and the least-squares method after log-transformation of anamorphic growth curve sets are included as special cases.
  3. The growth curve sets of the height of cypress (Chamaecyparis obtusa) and larch (Larix kaempferi) trees were estimated. Through model selection using the AIC and a likelihood ratio test, the growth curve set for cypress was found to be polymorphic, whereas that for larch was found to be anamorphic. The improved fit of the polymorphic model for cypress is due to its resolving underdispersion (less dispersion in the real data than the model predicts).
  4. The likelihood function for model estimation depends not only on the distribution type of the asymptotes but on the definition of the growth curve set as well. Consideration of these factors may be necessary even if environmental explanatory variables and random effects are introduced.
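The model selection in point 3 compares a richer polymorphic model against an anamorphic one via the AIC, which penalizes extra parameters. A minimal sketch with hypothetical fitted log-likelihoods and parameter counts (the numbers are invented for illustration, not taken from the paper):

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2*ln(L).  Lower is better."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical fits: the polymorphic model spends two extra parameters
# but achieves a higher log-likelihood; the AIC arbitrates the trade-off.
aic_anamorphic = aic(log_likelihood=-120.0, n_params=3)   # 246.0
aic_polymorphic = aic(log_likelihood=-110.0, n_params=5)  # 230.0
best = "polymorphic" if aic_polymorphic < aic_anamorphic else "anamorphic"
```

Here the 10-unit log-likelihood gain outweighs the 4-unit parameter penalty, so the polymorphic model is selected, mirroring the cypress result; with a smaller gain the anamorphic model would win, as for larch.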

8.
In many biomedical applications, it is desirable to estimate the three-dimensional (3D) position and orientation (pose) of a metallic rigid object (such as a knee or hip implant) from its projection in a two-dimensional (2D) X-ray image. If the geometry of the object is known, as well as the details of the image formation process, then the pose of the object with respect to the sensor can be determined. A common method for 3D-to-2D registration is to first segment the silhouette contour from the X-ray image; that is, identify all points in the image that belong to the 2D silhouette and not to the background. This segmentation step is then followed by a search for the 3D pose that will best match the observed contour with a predicted contour. Although the silhouette of a metallic object is often clearly visible in an X-ray image, adjacent tissue and occlusions can make the exact location of the silhouette contour difficult to determine in places. Occlusion can occur when another object (such as another implant component) partially blocks the view of the object of interest. In this paper, we argue that common methods for segmentation can produce errors in the location of the 2D contour, and hence errors in the resulting 3D estimate of the pose. We show, on a typical fluoroscopy image of a knee implant component, that interactive and automatic methods for segmentation result in segmented contours that vary significantly. We show how the variability in the 2D contours (quantified by two different metrics) corresponds to variability in the 3D poses. Finally, we illustrate how traditional segmentation methods can fail completely in the (not uncommon) cases of images with occlusion.

9.
10.
Soft X-ray tomography (SXT) is a powerful imaging technique that generates quantitative, 3D images of the structural organization of whole cells in a near-native state. SXT is also a high-throughput imaging technique. At the National Center for X-ray Tomography (NCXT), specimen preparation and image collection for tomographic reconstruction of a whole cell require only minutes. Aligning and reconstructing the data, however, take significantly longer. Here we describe a new component of the high throughput computational pipeline used for processing data at the NCXT. We have developed a new method for automatic alignment of projection images that does not require fiducial markers or manual interaction with the software. This method has been optimized for SXT data sets, which routinely involve full rotation of the specimen. This software gives users of the NCXT SXT instrument a new capability - virtually real-time initial 3D results during an imaging experiment, which can later be further refined. The new code, Automatic Reconstruction 3D (AREC3D), is also fast, reliable, and robust. The fundamental architecture of the code is also adaptable to high performance GPU processing, which enables significant improvements in speed and fidelity.

11.
With the availability of two-dimensional (2-D) gel electrophoresis databases that have many characterized proteins, it may be possible to compare a researcher’s gel images with those in relevant databases. This may lead to the putative identification of unknown protein spots in a researcher’s gel with those characterized in a given database, saving the researcher time and money by suggesting monoclonal antibodies to try in confirming these identifications. We have developed two tools to help with this comparison: (1) Flicker, http://www.lecb.ncifcrf.gov/flicker/, a Java applet program running in the researcher’s Web browser, to visually compare their gels against gels on the Internet; and (2) the 2DWG meta-database, http://www.lecb.ncifcrf.gov/2dwgDB/, a searchable database of locations of 2-D electrophoretic gel images found on the Internet. Recent additions to Flicker allow users to click on a protein spot in a gel that is linked to a federated 2D gel database, such as SWISS-2DPAGE, and have it retrieve a report from that Web database for that protein.

12.
In non‐model organisms, evolutionary questions are frequently addressed using reduced representation sequencing techniques due to their low cost, ease of use, and because they do not require genomic resources such as a reference genome. However, evidence is accumulating that such techniques may be affected by specific biases, questioning the accuracy of obtained genotypes, and as a consequence, their usefulness in evolutionary studies. Here, we introduce three strategies to estimate genotyping error rates from such data: through the comparison to high quality genotypes obtained with a different technique, from individual replicates, or from a population sample when assuming Hardy‐Weinberg equilibrium. Applying these strategies to data obtained with Restriction site Associated DNA sequencing (RAD‐seq), arguably the most popular reduced representation sequencing technique, revealed per‐allele genotyping error rates that were much higher than sequencing error rates, particularly at heterozygous sites that were wrongly inferred as homozygous. As we exemplify through the inference of genome‐wide and local ancestry of well characterized hybrids of two Eurasian poplar (Populus) species, such high error rates may lead to wrong biological conclusions. By properly accounting for these error rates in downstream analyses, either by incorporating genotyping errors directly or by recalibrating genotype likelihoods, we were nevertheless able to use the RAD‐seq data to support biologically meaningful and robust inferences of ancestry among Populus hybrids. Based on these findings, we strongly recommend carefully assessing genotyping error rates in reduced representation sequencing experiments, and to properly account for these in downstream analyses, for instance using the tools presented here.
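Of the three strategies, the replicate-based one is the simplest to sketch. Assuming genotypes are coded as allele counts (0, 1, 2), the discordance between replicate calls of the same individuals bounds the per-allele error rate from below (a simplified proxy, not the paper's actual estimator):

```python
def per_allele_error_rate(replicate_pairs):
    """Estimate a per-allele genotyping error rate from replicate genotype
    calls of the same individuals at the same site.  Genotypes are allele
    counts (0, 1, 2); a discordant pair of calls implies at least
    |g1 - g2| allele errors, so this is a lower-bound estimate.
    A simplified sketch of the replicate-based strategy."""
    mismatched = sum(abs(g1 - g2) for g1, g2 in replicate_pairs)
    total_alleles = 2 * len(replicate_pairs)  # two alleles per genotype call
    return mismatched / total_alleles
```

Note that a heterozygote (1) miscalled as a homozygote (0 or 2), the dominant error mode reported in the abstract, contributes exactly one allele mismatch per discordant replicate pair.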

13.

Background

De novo protein modeling approaches utilize 3-dimensional (3D) images derived from electron cryomicroscopy (CryoEM) experiments. The skeleton connecting two secondary structures, such as α-helices, represents the loop in the 3D image. The accuracy of the skeleton and of the detected secondary structures is critical in de novo modeling. It is important to measure the length along the skeleton accurately, since the length can be used as a constraint in modeling the protein.

Results

We have developed a novel computational geometric approach to derive a simplified curve in order to estimate the loop length along the skeleton. The method was tested using fifty simulated density images of helix-loop-helix segments of atomic structures and eighteen experimentally derived density maps from the Electron Microscopy Data Bank (EMDB). The test using simulated density maps shows that it is possible to estimate within 0.5 Å of the expected length in 48 of the 50 cases. The experiments involving the eighteen experimentally derived CryoEM images show that twelve cases have errors within 2 Å.
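Once a simplified curve through the skeleton is available, the loop length reduces to the arc length of a 3D polyline. A minimal sketch (the curve-derivation step itself is the paper's contribution and is not reproduced here):

```python
import math

def skeleton_length(points):
    """Length along a skeleton approximated as a 3D polyline: the sum of
    Euclidean distances between consecutive points.  Assumes the simplified
    curve has already been sampled into an ordered list of (x, y, z) points."""
    return sum(
        math.dist(p, q)  # Python 3.8+ Euclidean distance
        for p, q in zip(points, points[1:])
    )
```

Denser sampling of the curve gives a better approximation of the true arc length; with too few points the polyline systematically underestimates it.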

Conclusions

The tests using both simulated and experimentally derived images show that it is possible for our proposed method to estimate the loop length along the skeleton if the secondary structure elements, such as α-helices, can be detected accurately, and there is a continuous skeleton linking the α-helices.

14.
Combination of conventional histology with a three-dimensional spatial view of tissue structures offers new prospects for understanding and diagnosing the nature and development of human diseases. The essential technical problem in three-dimensional reconstruction in histopathology is the correct alignment of serial sections. During the past years, several methods have been proposed but failed to become popular because of their limits in terms of time consumption and restricted applicability. We aimed to overcome this problem by applying Tissue Array technology, that is, by positioning adequate fiducial markers from specific "donor" blocks into the "recipient" paraffin block of interest. Digitized pictures of serially cut sections were aligned according to the tissue markers embedded by Tissue Array and then processed with specific software for three-dimensional reconstruction. Thirteen models, including fetal hearts and breast and thyroid carcinomas, were elaborated. We found the procedure to be easy, fast, and reproducible. Moreover, by selectively embedding the fiducial markers at specific angles, the Tissue Arrays can be exploited to establish the distance between sections. This original methodology of incorporating Tissue Arrays into paraffin blocks as fiducial markers for three-dimensional reconstruction has a potential impact on histology for research purposes and diagnostic applications.

15.
Measurements of joint angles during motion analysis are subject to error caused by kinematic crosstalk, that is, one joint rotation (e.g., flexion) being interpreted as another (e.g., abduction). Kinematic crosstalk results from the chosen joint coordinate system being misaligned with the axes about which rotations are assumed to occur. The aim of this paper is to demonstrate that measurement of the so-called "screw-home" motion of the human knee, in which axial rotation and extension are coupled, is especially prone to errors due to crosstalk. The motions of two different two-segment mechanical linkages were examined to study the effects of crosstalk. The segments of the first linkage (NSH) were connected by a revolute joint, but the second linkage (SH) incorporated gearing that caused 15 degrees of screw-home rotation to occur with 90 degrees of knee flexion. It was found that rotating the flexion axis (inducing crosstalk) could make linkage NSH appear to exhibit a screw-home motion and that a different rotation of the flexion axis could make linkage SH apparently exhibit pure flexion. These findings suggest that the measurement of screw-home rotation may be strongly influenced by errors in the location of the flexion axis. The magnitudes of these displacements of the flexion axis were consistent with the inter-observer variability seen when five experienced observers defined the flexion axis by palpating the medial and lateral femoral epicondyles. Care should be taken when interpreting small internal-external rotations and abduction-adduction angles to ensure that they are not the products of kinematic crosstalk.
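Crosstalk can be reproduced with a few lines of linear algebra: a rotation about the true flexion axis, expressed in a coordinate system whose flexion axis is misaligned, acquires a spurious component about another axis. A stdlib-only sketch (the 15-degree misalignment about the superior axis is illustrative, not a value from the paper):

```python
import math

def rot_z(theta):
    """Rotation matrix about the z (superior) axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

# True motion: pure flexion, i.e. rotation about the x-axis.
true_axis = [1.0, 0.0, 0.0]

# The examiner's flexion axis is misaligned by 15 degrees about z
# (e.g. from mis-palpated femoral epicondyles).
misalignment = rot_z(math.radians(15))

# Expressed in the misaligned frame, the rotation axis becomes S^T * axis.
apparent_axis = mat_vec(transpose(misalignment), true_axis)

# apparent_axis ~ [0.966, -0.259, 0.0]: the nonzero y component is read
# out as abduction-adduction even though the joint only flexed.
```

The spurious component scales with sin of the misalignment angle, which is why small internal-external rotation and abduction-adduction readings deserve skepticism.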

16.
The accuracy of pointing movements performed under different head positions to remembered target locations in 3-D space was studied in healthy persons. The subjects fixated a visual target, then closed their eyes and, after 1.0 sec, performed the targeted movement with their right arm. The target (a point light source) was presented in random order by a programmable robot arm at one of five spatial locations. The accuracy of pointing movements was examined in a spherical coordinate system centered at the shoulder of the responding arm. The pointing movements were most accurate under natural eye-head coordination. With the head fixed in the straight-ahead position, both the 3-D absolute error and its standard deviation increased significantly. At the same time, individual components of spatial error (directional and radial) did not change significantly. With the head turned to the rightmost or leftmost position, pointing accuracy was disturbed to a greater extent than under the head-fixed condition. The main contributors to the 3-D absolute error were the changes in the azimuth error. The latter depended on the direction of the head-turn: the rightmost turn either increased leftward or decreased rightward shift, and conversely, the left turn increased rightward shift or decreased leftward shift of the target-directed movements. It is suggested that the increased inaccuracy of pointing under the head-fixed condition reflects an impairment of the eye-head coordination underlying gaze orientation, and the increased inaccuracy under the head-turned condition may be explained by changes in the internal representation of the head and target position in space. Neirofiziologiya/Neurophysiology, Vol. 26, No. 2, pp. 122–131, March–April, 1994.

17.

Background  

In current comparative proteomics studies, the large numbers of images generated by 2D gels are compared using spot matching algorithms. Unfortunately, differences in gel migration and sample variability make efficient spot alignment very difficult to obtain, and, as a consequence, most software alignments return noisy gel matchings that need to be manually adjusted by the user.

18.
Estimating haplotype frequencies becomes increasingly important in the mapping of complex disease genes, as millions of single nucleotide polymorphisms (SNPs) are being identified and genotyped. When genotypes at multiple SNP loci are gathered from unrelated individuals, haplotype frequencies can be accurately estimated using expectation-maximization (EM) algorithms (Excoffier and Slatkin, 1995; Hawley and Kidd, 1995; Long et al., 1995), with standard errors estimated using bootstraps. However, because the number of possible haplotypes increases exponentially with the number of SNPs, handling data with a large number of SNPs poses a computational challenge for the EM methods and for other haplotype inference methods. To solve this problem, Niu and colleagues, in their Bayesian haplotype inference paper (Niu et al., 2002), introduced a computational algorithm called progressive ligation (PL). However, their Bayesian method limits the number of subjects (no more than 100 subjects in the current implementation). In this paper, we propose a new method in which we use the same likelihood formulation as in Excoffier and Slatkin's EM algorithm and apply the estimating equation idea and the PL computational algorithm with some modifications. Our proposed method can handle data sets with large numbers of SNPs as well as large numbers of subjects. Simultaneously, our method estimates standard errors efficiently, using the sandwich estimate from the estimating equation rather than the bootstrap method. Additionally, our method admits missing data and produces valid estimates of parameters and their standard errors under the assumption that the missing genotypes are missing at random in the sense defined by Rubin (1976).
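For two SNPs, the EM idea reduces to a compact loop: only the double heterozygote is phase-ambiguous, and the E-step splits it between the two possible phases in proportion to the current haplotype frequencies. A didactic sketch in the spirit of the cited EM algorithms, not the authors' estimating-equation method:

```python
def em_haplotype_freqs(genotypes, iters=50):
    """EM estimation of two-SNP haplotype frequencies from unphased
    genotypes.  Genotypes are (g1, g2) allele counts in {0, 1, 2}; only
    the double heterozygote (1, 1) has ambiguous phase.  Haplotype
    indices: 0=AB, 1=Ab, 2=aB, 3=ab."""
    freqs = [0.25] * 4
    n_hap = 2 * len(genotypes)
    for _ in range(iters):
        counts = [0.0] * 4
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: split the double heterozygote between the two
                # phases in proportion to current frequencies.
                p_cis = freqs[0] * freqs[3]    # phase AB/ab
                p_trans = freqs[1] * freqs[2]  # phase Ab/aB
                w = p_cis / (p_cis + p_trans)
                counts[0] += w; counts[3] += w
                counts[1] += 1 - w; counts[2] += 1 - w
            else:
                # Unambiguous phase: resolve the two haplotypes directly.
                l1 = ['A', 'A'] if g1 == 2 else ['a', 'a'] if g1 == 0 else ['A', 'a']
                l2 = ['B', 'B'] if g2 == 2 else ['b', 'b'] if g2 == 0 else ['B', 'b']
                for a1, a2 in zip(l1, l2):
                    counts[2 * (a1 == 'a') + (a2 == 'b')] += 1
        # M-step: new frequencies are expected counts over 2N chromosomes.
        freqs = [c / n_hap for c in counts]
    return freqs
```

With more SNPs the number of possible phases per individual explodes, which is exactly the computational challenge the abstract says progressive ligation addresses.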

19.
In order to successfully perform 3D reconstruction in electron tomography, transmission electron microscope images must be accurately aligned or registered. So far, the problem has been solved either by manually marking corresponding fiducial markers in the set of images or automatically, using simple correlation between the images over several rotations and scales. The present solutions, however, share the problem of being inefficient and/or inaccurate. We therefore propose a method in which the registration is automated using conventional colloidal gold particles as reference markers between images. We approach the problem from the computer vision viewpoint; hence, the alignment problem is divided into several subproblems: (1) finding initial matches from successive images, (2) estimating the epipolar geometry between consecutive images, (3) finding and localizing the gold particles with subpixel accuracy in each image, (4) predicting the probable matching gold particles using the epipolar constraint and its uncertainty, (5) matching and tracking the gold beads through the tilt series, and (6) optimizing the transformation parameters for the whole image set. The results show not only the reliability of the suggested method but also a high level of accuracy in alignment, since practically all the visible gold markers can be used.

20.
Biophysical Journal. 2022;121(15):2906-2920.
Single-molecule localization microscopy (SMLM) permits the visualization of cellular structures an order of magnitude smaller than the diffraction limit of visible light, and an accurate, objective evaluation of the resolution of an SMLM data set is an essential aspect of the image processing and analysis pipeline. Here, we present a simple method to estimate the localization spread function (LSF) of a static SMLM data set directly from acquired localizations, exploiting the correlated dynamics of individual emitters and properties of the pair autocorrelation function evaluated in both time and space. The method is demonstrated on simulated localizations, DNA origami rulers, and cellular structures labeled by dye-conjugated antibodies, DNA-PAINT, or fluorescent fusion proteins. We show that experimentally obtained images have LSFs that are broader than expected from the localization precision alone, due to additional uncertainty accrued when localizing molecules imaged over time.
