1.
Background
Although structural domains (SDs) in proteins are important, half of the regions in the human proteome currently have no SD assignment. These unassigned regions consist not only of novel SDs but also of intrinsically disordered (ID) regions, since proteins, especially those of eukaryotes, generally contain a significant fraction of ID regions. As ID regions can be inferred from amino acid sequences, a method that combines SD and ID region assignments can determine the fractions of SDs and ID regions in any proteome.
Results
In contrast to other available ID prediction programs that merely identify likely ID regions, the DICHOT system we previously developed classifies an entire protein sequence into SDs and ID regions. Application of DICHOT to the human proteome revealed that, residue-wise, ID regions constitute 35%, SDs with similarity to PDB structures comprise 52%, and SDs with no similarity to PDB structures account for the remaining 13%. The last group consists of novel structural domains, termed cryptic domains, which serve as good targets for structural genomics. Applying DICHOT to the proteomes of other model organisms indicated that eukaryotes generally have high ID contents, while prokaryotes do not. In human proteins, ID contents differ among subcellular localizations: nuclear proteins have the highest residue-wise ID fraction (47%), while mitochondrial proteins exhibit the lowest (13%). Phosphorylation and O-linked glycosylation sites were found to be located preferentially in ID regions. As O-linked glycans are attached to residues in the extracellular regions of proteins, the modification is likely to protect ID regions from proteolytic cleavage in the extracellular environment. Alternative splicing events tend to occur more frequently in ID regions; we interpret this as evidence that natural selection is operating at the protein level in alternative splicing.
Conclusions
We classified the entire regions of proteins into two categories, SDs and ID regions, and thereby obtained a variety of complete genome-wide statistics. The results of the present study are important basic information for understanding protein structural architectures and have been made publicly available at http://spock.genes.nig.ac.jp/~genome/DICHOT.
8.
Stuhlmeier KM. Biochimica et Biophysica Acta. 2001;1524(1):57-65
Quinacrine has been used for decades, and the beneficial effects of this drug are as numerous as its toxic effects. Since endothelial cells (EC) are in many cases the first cells to come into contact with drugs, the effects of quinacrine on certain aspects of EC biology were studied. The data presented demonstrate that quinacrine can have a marked impact on the integrity of the EC monolayer without grossly interfering with cell viability. The described impact of quinacrine on EC might explain, at least in part, the toxic effects of this drug observed in the past. Furthermore, quinacrine profoundly affects gene regulation in EC. Quinacrine binds to DNA in a sequence-specific manner: while NF-kappa B-DNA interactions are not affected, AP-1-DNA binding is blocked by quinacrine. Such differential effects are presumably due to intercalation of quinacrine into the AP-1 consensus element. Preincubation of oligonucleotides resembling this sequence blocked the subsequent binding of nuclear extracts containing AP-1 protein(s). Taken together, these data suggest that quinacrine interferes with EC physiology and alters the repertoire of EC responses to stimuli. Furthermore, the differential effects of quinacrine might be exploited to study and gain additional insight into the involvement of AP-1 and NF-kappa B in gene regulation.
9.
Virtual screening performs simulated pre-screening of compound molecules on a computer to identify small molecules (ligands) that readily bind a drug target, thereby reducing the number of physical experiments required and improving the efficiency of drug lead discovery. Commonly used molecular docking software supports structure-based virtual screening, searching for the optimal interaction mode and binding conformation between ligand and target and selecting potential ligands via a scoring function. Existing docking software such as AutoDock Vina consumes considerable time and computing resources during docking; for large-scale docking in particular, the long screening time cannot meet application needs. Therefore, building on QVina2, the most efficient docking program, this paper proposes QVina2-GPU, a GPU-based parallelization of QVina2 that exploits the highly parallel GPU architecture to accelerate molecular docking. Specifically, the number of initial molecular conformations is increased to expand the thread-level parallelism of the Monte Carlo iterated local search, broadening the Monte Carlo search so that the depth of each Monte Carlo iteration can be reduced; the local search algorithm is further improved with the Wolfe-Powell conditions, which raises docking accuracy and allows the search depth to be reduced further. Finally, the performance of QVina2-GPU was validated on public ligand databases on an NVIDIA GeForce RTX 3090 platform. The experiments show that, while preserving docking accuracy, QVina2-GPU achieves an average speedup of 5.18× over QVina2, with a maximum speedup of 12.28×.
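As a loose illustration of the breadth-over-depth idea described above (a CPU-side sketch, not the QVina2-GPU implementation), the following Python snippet launches many short, independent Monte Carlo chains in parallel and keeps the best-scoring conformation; the scoring function, step sizes, and acceptance rule are placeholder assumptions.

```python
# Breadth over depth: many short Monte Carlo chains run in parallel and the
# best-scoring conformation wins. The scoring function is a stand-in for a
# real protein-ligand docking score.
import random
from multiprocessing import Pool

def docking_score(conformation):
    # Placeholder for a real docking scoring function.
    return sum((x - 0.5) ** 2 for x in conformation)

def monte_carlo_chain(args):
    seed, n_steps, dim = args
    rng = random.Random(seed)
    best = [rng.random() for _ in range(dim)]
    best_score = docking_score(best)
    current, current_score = best[:], best_score
    for _ in range(n_steps):
        candidate = [x + rng.gauss(0, 0.05) for x in current]
        score = docking_score(candidate)
        if score < current_score or rng.random() < 0.01:  # accept downhill, rarely uphill
            current, current_score = candidate, score
            if score < best_score:
                best, best_score = candidate, score
    return best_score, best

if __name__ == "__main__":
    # 64 short chains (breadth) instead of a few long ones (depth).
    tasks = [(seed, 200, 6) for seed in range(64)]
    with Pool() as pool:
        results = pool.map(monte_carlo_chain, tasks)
    print("best score:", min(results)[0])
```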
10.
Background
The processing of images acquired through microscopy is a challenging task due to the large size of the datasets (several gigabytes) and the fast turnaround time required. If the throughput of the image processing stage is significantly increased, it can have a major impact on microscopy applications.
Results
We present a high performance computing (HPC) solution to this problem. It involves decomposing the spatial 3D image into segments that are assigned to unique processors and matched to the 3D torus architecture of the IBM Blue Gene/L machine. Communication between segments is restricted to nearest neighbors. When running on a 2 GHz Intel CPU, the task of 3D median filtering on a typical 256 megabyte dataset takes two and a half hours, whereas using 1024 nodes of Blue Gene this task can be performed in 18.8 seconds, a 478× speedup.
Conclusion
Our parallel solution dramatically improves the performance of image processing, feature extraction and 3D reconstruction tasks. This increased throughput permits biologists to conduct unprecedented large-scale experiments with massive datasets.
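The following toy Python sketch (an illustration under simplifying assumptions, not the Blue Gene/L code) shows the decomposition idea: the volume is split into slabs, each slab carries a one-voxel halo from its neighbors so that a 3×3×3 median filter needs no further communication, and the trimmed results reassemble into exactly the full-volume answer.

```python
import numpy as np
from scipy.ndimage import median_filter

# Split a 3D volume into slabs with a one-voxel halo, filter each slab
# independently, trim the halos, and reassemble.
volume = np.random.rand(64, 64, 64)
n_slabs, halo = 4, 1
bounds = np.linspace(0, volume.shape[0], n_slabs + 1, dtype=int)

pieces = []
for lo, hi in zip(bounds[:-1], bounds[1:]):
    lo_h, hi_h = max(lo - halo, 0), min(hi + halo, volume.shape[0])
    filtered = median_filter(volume[lo_h:hi_h], size=3)   # work done per "processor"
    pieces.append(filtered[lo - lo_h : filtered.shape[0] - (hi_h - hi)])

reassembled = np.concatenate(pieces, axis=0)
assert np.allclose(reassembled, median_filter(volume, size=3))
print("slab-wise filtering with a 1-voxel halo matches the full-volume filter")
```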
12.
Fernández JJ. Journal of Structural Biology. 2008;164(1):1-6
Computational advances have significantly contributed to the current role of electron cryomicroscopy (cryoEM) in structural biology. The need for computational power is constantly growing with the increasing complexity of algorithms and the amount of data needed to push the resolution limits. High performance computing (HPC) is becoming paramount in cryoEM to cope with those computational needs. Since the nineties, different HPC strategies have been proposed for some specific problems in cryoEM and, in fact, some of them are already available in common software packages. Nevertheless, the literature is scattered across the areas of computer science and structural biology. In this communication, the HPC approaches devised for the computation-intensive tasks in cryoEM (single particles and tomography) are retrospectively reviewed and the future trends are discussed. Moreover, the HPC capabilities available in the most common cryoEM packages are surveyed, as evidence of the importance of HPC in addressing the future challenges.
13.
Models of gene regulatory networks (GRNs) attempt to explain the complex processes that determine cells' behavior, such as differentiation, metabolism, and the cell cycle. The advent of high-throughput data generation technologies has allowed researchers to fit theoretical models to experimental data on gene-expression profiles. GRNs are often represented using logical models. These models require that real-valued measurements be converted to discrete levels, such as on/off, but the discretization often introduces inconsistencies into the data. Dimitrova et al. posed the problem of efficiently finding a parsimonious resolution of the introduced inconsistencies. We show that reconstruction of a logical GRN that minimizes the errors is NP-complete, so an efficient exact algorithm for the problem is unlikely to exist. We present a probabilistic formulation of the problem that circumvents discretization of the expression data. We phrase the problem of error reduction as a minimum entropy problem, develop a heuristic algorithm for it, and evaluate its performance on mouse embryonic stem cell data. The constructed model displays high consistency with prior biological knowledge. Despite the oversimplification of a discrete model, we show that it is superior to raw experimental measurements and demonstrates a highly significant level of identical regulatory logic among co-regulated genes. Software implementing the method is freely available at: http://acgt.cs.tau.ac.il/modent.
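To make the error-minimization view concrete, here is a toy Python illustration (not the paper's heuristic, and with made-up observations): for a target gene with two candidate regulators, every possible Boolean rule is enumerated and the one contradicting the fewest discretized observations is kept.

```python
# Toy error-minimizing reconstruction of one gene's Boolean regulation rule.
from itertools import product

# Hypothetical discretized data: (regulator_a, regulator_b) -> target at next step.
observations = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1),
                ((0, 1), 1), ((1, 1), 0)]   # last row is inconsistent on purpose

best_rule, best_errors = None, len(observations) + 1
for outputs in product((0, 1), repeat=4):          # one output per input combination
    rule = dict(zip(product((0, 1), repeat=2), outputs))
    errors = sum(rule[inp] != out for inp, out in observations)
    if errors < best_errors:
        best_rule, best_errors = rule, errors

print("best rule:", best_rule, "errors:", best_errors)
```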
14.
Discovering small molecules that interact with protein targets will be a key part of future drug discovery efforts. Molecular docking of drug-like molecules is likely to be valuable in this field; however, the great number of such molecules makes the potential size of this task enormous. In this paper, a method to screen small-molecule databases using cloud computing is proposed. This method, called the hierarchical method for molecular docking, can be completed in a relatively short period of time. In this method, the optimization of molecular docking is divided into two subproblems based on their different effects on the protein–ligand interaction energy. An adaptive genetic algorithm is developed to solve the optimization problem, and a new docking program (FlexGAsDock) based on the hierarchical docking method has been developed. The implementation of docking on a cloud computing platform is then discussed. The docking results show that this method can be conveniently used for the efficient molecular design of drugs.
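For readers unfamiliar with the underlying optimizer, the sketch below is a bare-bones genetic algorithm in Python (a generic illustration, not FlexGAsDock): a ligand pose is encoded as a real vector, and selection, one-point crossover, and Gaussian mutation are applied against a placeholder scoring function standing in for the protein–ligand interaction energy.

```python
import random

def score(pose):                       # stand-in for a protein-ligand energy term
    return sum((x - 0.3) ** 2 for x in pose)

def evolve(pop_size=40, dim=7, generations=100, mutation=0.1):
    rng = random.Random(0)
    pop = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score)
        parents = pop[: pop_size // 2]              # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, dim)
            child = a[:cut] + b[cut:]               # one-point crossover
            child = [x + rng.gauss(0, mutation) if rng.random() < 0.2 else x
                     for x in child]                # per-gene Gaussian mutation
            children.append(child)
        pop = parents + children
    return min(pop, key=score)

print("best score:", score(evolve()))
```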
15.
Computational methods have been used in biology for sequence analysis (bioinformatics), all-atom simulation (molecular dynamics and quantum calculations), and more recently for modeling biological networks (systems biology). Of these three techniques, all-atom simulation is currently the most computationally demanding in terms of compute load, communication speed, and memory load. Breakthroughs in electrostatic force calculation and dynamic load balancing have enabled molecular dynamics simulations of large biomolecular complexes. Here, we report simulation results for the ribosome, using approximately 2.64 million atoms, the largest all-atom biomolecular simulation published to date. Several other nano-scale systems with different numbers of atoms were studied to measure the performance of the NAMD molecular dynamics simulation program on the Los Alamos National Laboratory Q Machine. We demonstrate that multimillion-atom systems represent a 'sweet spot' for the NAMD code on large supercomputers. NAMD displays an unprecedented 85% parallel scaling efficiency for the ribosome system on 1024 CPUs. We also review recent targeted molecular dynamics simulations of the ribosome that prove useful for studying conformational changes of this large biomolecular complex in atomic detail.
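As a quick reminder of what the 85% figure means, parallel scaling efficiency is speedup divided by processor count, E(P) = T(1) / (P · T(P)). The Python snippet below shows the arithmetic; the timings are hypothetical, and only the 85%-on-1024-CPUs figure comes from the abstract.

```python
# Parallel scaling efficiency: E(P) = T(1) / (P * T(P)).
def parallel_efficiency(t_serial, t_parallel, processors):
    return t_serial / (processors * t_parallel)

# Hypothetical timings: 870.4 s on 1 CPU and 1.0 s on 1024 CPUs give
# 870.4 / (1024 * 1.0) = 0.85, i.e. the 85% efficiency quoted for the ribosome run.
print(parallel_efficiency(870.4, 1.0, 1024))   # -> 0.85
```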
17.
Bustamam A, Burrage K, Hamilton NA. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2012;9(3):679-692
Markov clustering (MCL) is becoming a key algorithm within bioinformatics for determining clusters in networks. However, with the increasingly vast amounts of data on biological networks, performance and scalability issues are becoming a critical limiting factor in applications. Meanwhile, GPU computing, which uses CUDA to implement a massively parallel computing environment on the GPU card, is becoming a very powerful, efficient, and low-cost option for achieving substantial performance gains over CPU approaches. The use of on-chip memory on the GPU efficiently lowers latency, thus circumventing a major issue in other parallel computing environments such as MPI. We introduce a very fast Markov clustering algorithm using CUDA (CUDA-MCL) to perform the parallel sparse matrix-matrix computations and parallel sparse Markov matrix normalizations that are at the heart of MCL. We utilized the ELLPACK-R sparse format to allow effective, fine-grained massively parallel processing that copes with the sparse nature of the interaction-network data sets found in bioinformatics applications. As the results show, CUDA-MCL is significantly faster than the original MCL running on the CPU. Thus, large-scale parallel computation on off-the-shelf desktop machines, previously only possible on supercomputing architectures, can significantly change the way bioinformaticians and biologists deal with their data.
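The following Python sketch illustrates the ELLPACK-R layout mentioned above (an illustration of the format, not the CUDA-MCL kernels): values and column indices are padded to the longest row, and a separate row-length array lets each row's worker stop at the real end of its row instead of reading padding.

```python
import numpy as np

def to_ellpack_r(dense):
    # Convert a dense matrix to ELLPACK-R: padded values, padded column
    # indices, and a per-row nonzero count.
    rows, _ = dense.shape
    row_len = np.count_nonzero(dense, axis=1)
    width = row_len.max()
    values = np.zeros((rows, width))
    col_idx = np.zeros((rows, width), dtype=int)
    for i in range(rows):
        nz = np.flatnonzero(dense[i])
        values[i, : nz.size] = dense[i, nz]
        col_idx[i, : nz.size] = nz
    return values, col_idx, row_len

def ellpack_r_matvec(values, col_idx, row_len, x):
    y = np.zeros(values.shape[0])
    for i in range(values.shape[0]):     # one GPU thread per row in the CUDA setting
        for j in range(row_len[i]):      # row_len avoids touching the padding
            y[i] += values[i, j] * x[col_idx[i, j]]
    return y

A = np.array([[0.0, 2.0, 0.0], [1.0, 0.0, 3.0], [0.0, 0.0, 4.0]])
x = np.array([1.0, 1.0, 1.0])
vals, cols, rl = to_ellpack_r(A)
assert np.allclose(ellpack_r_matvec(vals, cols, rl, x), A @ x)
```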
18.
Recently, graphics processing units (GPUs) have become increasingly popular for high performance computing applications. Although GPUs provide high peak performance, exploiting their full performance potential in application programs remains a challenging task for the programmer. When launching a parallel kernel of an application on the GPU, the programmer needs to carefully select the number of blocks (grid size) and the number of threads per block (block size). These values determine the degree of SIMD parallelism and multithreading, and greatly influence performance. With a huge range of possible combinations of these values, choosing the right grid size and block size is not straightforward. In this paper, we propose a mathematical model for tuning the grid size and block size based on GPU architecture parameters. Using our model, we first calculate a small set of candidate grid size and block size values, and then search for the optimal values among the candidates through experiments. Our approach significantly reduces the potential search space compared with the exhaustive search approaches of previous research, and can therefore be practically applied to real applications.
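The sketch below conveys the candidate-pruning idea in Python; it does not reproduce the paper's analytic model, and the architecture parameters and the simple occupancy estimate are assumptions chosen purely for illustration.

```python
# Enumerate warp-aligned block sizes, rank them by a crude occupancy estimate,
# and keep only the top candidates for empirical tuning.
WARP_SIZE = 32
MAX_THREADS_PER_BLOCK = 1024
MAX_THREADS_PER_SM = 2048
MAX_BLOCKS_PER_SM = 32

def occupancy(block_size):
    blocks = min(MAX_BLOCKS_PER_SM, MAX_THREADS_PER_SM // block_size)
    return blocks * block_size / MAX_THREADS_PER_SM

def candidate_block_sizes(top_k=4):
    sizes = range(WARP_SIZE, MAX_THREADS_PER_BLOCK + 1, WARP_SIZE)
    return sorted(sizes, key=occupancy, reverse=True)[:top_k]

def grid_size(n_elements, block_size):
    return (n_elements + block_size - 1) // block_size   # ceiling division

for bs in candidate_block_sizes():
    print(f"block={bs:4d}  grid={grid_size(1_000_000, bs):6d}  occ={occupancy(bs):.2f}")
```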
20.
Iterative applications are known to run as slow as their slowest computational component. This paper introduces malleability, a new dynamic reconfiguration strategy to overcome this limitation. Malleability is the ability to dynamically change the data size and number of computational entities in an application. Malleability can be used by middleware to autonomously reconfigure an application in response to dynamic changes in resource availability in an architecture-aware manner, allowing applications to optimize the use of multiple processors and diverse memory hierarchies in heterogeneous environments.

The modular Internet Operating System (IOS) was extended to reconfigure applications autonomously using malleability. Two different iterative applications were made malleable. The first, used in astronomical modeling and representative of maximum-likelihood applications, was made malleable in the SALSA programming language. The second models the diffusion of heat over a two-dimensional object and is representative of applications such as partial differential equations and some types of distributed simulations. Versions of the heat application were made malleable both in SALSA and in MPI. Algorithms for concurrent data redistribution are given for each type of application. Results show that using malleability for reconfiguration is 10 to 100 times faster in the tested environments. The algorithms are also shown to be highly scalable with respect to the quantity of data involved. While previous work has shown the utility of dynamically reconfigurable applications using only computational component migration, malleability is shown to provide up to a 15% speedup over component migration alone in a dynamic cluster environment.

This work is part of an ongoing research effort to enable applications to be highly reconfigurable and autonomously modifiable by middleware in order to efficiently utilize distributed environments. Grid computing environments are becoming increasingly heterogeneous and dynamic, placing new demands on applications' adaptive behavior. This work shows that malleability is a key aspect in enabling effective dynamic reconfiguration of iterative applications in these environments.
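To make the redistribution step concrete, here is a minimal Python sketch (an assumption-based illustration; the IOS/SALSA and MPI implementations are not reproduced): when the number of workers changes, a balanced block partition is recomputed and only the items whose owner changed need to move.

```python
# Recompute a balanced block partition and list the item transfers needed
# when the number of computational entities changes.
def block_partition(n_items, n_workers):
    base, extra = divmod(n_items, n_workers)
    sizes = [base + (1 if w < extra else 0) for w in range(n_workers)]
    owners = []
    for worker, size in enumerate(sizes):
        owners.extend([worker] * size)
    return owners                      # owners[i] = worker holding item i

def reconfigure(n_items, old_workers, new_workers):
    old = block_partition(n_items, old_workers)
    new = block_partition(n_items, new_workers)
    return [(i, o, n) for i, (o, n) in enumerate(zip(old, new)) if o != n]

moves = reconfigure(n_items=12, old_workers=3, new_workers=4)
print(f"{len(moves)} of 12 items migrate when growing from 3 to 4 workers")
```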