Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
The study of macromolecular structures has expanded our understanding of the amazing cell machinery, and such knowledge has changed how the pharmaceutical industry develops new vaccines in recent years. Traditionally, X-ray crystallography has been the main method for structure determination; however, cryogenic electron microscopy (cryo-EM) has become increasingly popular due to recent advancements in hardware and software. The number of cryo-EM maps deposited in the EMDataResource (formerly EMDatabase) since 2002 has been increasing dramatically and continues to do so. De novo macromolecular complex modeling is a labor-intensive process; therefore, it is highly desirable to develop software that can automate this process. Here we discuss our automated, data-driven, and artificial intelligence approaches, including map processing, feature extraction, model building, and target identification. Recently, we have enabled DNA/RNA modeling in our deep learning-based prediction tool, DeepTracer. We have also developed DeepTracer-ID, a tool that can identify proteins solely based on the cryo-EM map. In this paper, we will present our accumulated experiences in developing deep learning-based methods surrounding macromolecule modeling applications.

2.
In this mini review, we capture the latest progress of applying artificial intelligence (AI) techniques based on deep learning architectures to molecular de novo design with a focus on integration with experimental validation. We will cover the progress and experimental validation of novel generative algorithms, the validation of QSAR models, and how AI-based molecular de novo design is starting to become connected with chemistry automation. While progress has been made in the last few years, it is still early days. The experimental validations conducted thus far should be considered proof-of-principle, providing confidence that the field is moving in the right direction.

3.
Generative molecular design for drug discovery and development has seen a recent resurgence, promising to improve the efficiency of the design-make-test-analyse cycle by computationally exploring much larger chemical spaces than traditional virtual screening techniques. However, most generative models thus far have only utilized small-molecule information to train and condition de novo molecule generators. Here, we instead focus on recent approaches that incorporate protein structure into de novo molecule optimization in an attempt to maximize the predicted on-target binding affinity of generated molecules. We summarize these structure integration principles into either distribution learning or goal-directed optimization and, for each case, whether the approach is protein structure-explicit or implicit with respect to the generative model. We discuss recent approaches in the context of this categorization and provide our perspective on the future direction of the field.

4.
In bifidobacteria, phosphoketolase (PKT) plays a key role in the central hexose fermentation pathway called “bifid shunt.” The three-dimensional structure of PKT from Bifidobacterium longum with co-enzyme thiamine diphosphate (ThDpp) was determined at 2.1 Å resolution by cryo-EM single-particle analysis using 196,147 particles to build up the structural model of a PKT octamer related by D4 symmetry. Although the cryo-EM structure of PKT was almost identical to the X-ray crystal structure previously determined at 2.2 Å resolution, several interesting structural features were observed in the cryo-EM structure. Because this structure was solved at relatively high resolution, it was observed that several amino acid residues adopt multiple conformations. Among them, Q546–D547–H548–N549 (the QN-loop) shows the largest structural change, which seems to be related to the enzymatic function of PKT. The QN-loop is at the entrance to the substrate binding pocket. The minor conformer of the QN-loop is similar to the conformation of the QN-loop in the crystal structure. The major conformer is located further from ThDpp than the minor conformer. Interestingly, the major conformer in the cryo-EM structure of PKT resembles the corresponding loop structure of substrate-bound Escherichia coli transketolase. That is, the minor and major conformers may correspond to “closed” and “open” states for substrate access, respectively. Moreover, because of the high-resolution analysis, many water molecules were observed in the cryo-EM structure of PKT. Structural features of the water molecules in the cryo-EM structure are discussed and compared with water molecules observed in the crystal structure.
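
An octamer related by D4 symmetry can be pictured as one subunit copied by the eight rotations of the dihedral group D4: four rotations about a 4-fold axis, each optionally combined with a 180° rotation about a perpendicular 2-fold axis. The following is a minimal sketch of that idea only. The subunit coordinates and all function names are hypothetical, and placing the 4-fold axis along z and a 2-fold axis along x is a convention assumed here, not something taken from the abstract.

```python
import math

def rot_z(angle):
    """Rotation matrix for a rotation by `angle` radians about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))

# 180-degree rotation about the x axis (one of the perpendicular 2-fold axes).
FLIP_X = ((1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, -1.0))

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

def dihedral_operations(n):
    """Return the 2n rotation matrices of the dihedral point group D_n."""
    ops = []
    for k in range(n):
        r = rot_z(2.0 * math.pi * k / n)
        ops.append(r)                  # pure rotation about the n-fold axis
        ops.append(matmul(r, FLIP_X))  # same rotation combined with the 2-fold flip
    return ops

def apply(op, point):
    """Apply a 3x3 rotation matrix to a 3D point."""
    return tuple(sum(op[i][j] * point[j] for j in range(3)) for i in range(3))

ops = dihedral_operations(4)
subunit_center = (30.0, 10.0, 15.0)  # hypothetical subunit position, in Å
centers = [apply(op, subunit_center) for op in ops]
```

Because the hypothetical subunit does not sit on any symmetry axis, the eight operations give eight distinct positions, four above and four below the plane perpendicular to the 4-fold axis.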

5.
Cryogenic electron microscopy (cryo-EM) is now one of the most powerful and widely used methods to determine high-resolution structures of macromolecules. A major bottleneck of cryo-EM is the preparation of high-quality vitrified specimens, which still faces many practical challenges. During the conventional vitrification process, macromolecules tend to adsorb at the air–water interface (AWI), which is known to be unfriendly to biological samples. In this review, we outline the nature of the AWI and the problems caused by it, such as unpredictable or uneven particle distribution, protein denaturation, dissociation of complexes, and preferential orientation. We review and discuss the approaches and underlying mechanisms to deal with the AWI: 1) Additives, exemplified by detergents, which form a protective layer at the AWI and thus preserve the native folds of target macromolecules. 2) Fast vitrification devices based on the idea of freezing in-solution macromolecules before they reach the AWI. 3) Thin layers of continuous supporting film that adsorb macromolecules and, when functionalized with affinity ligands, specifically anchor the target particles away from the AWI. Among these supporting films, graphene, together with its derivatives, with negligible background noise and mechanical robustness, has emerged as a new generation of support. These strategies have proven successful in various cases and enable better handling of the problems caused by the AWI in cryo-EM specimen preparation.

6.
DNA replication has been reconstituted in vitro with yeast proteins, and the minimal system requires the coordinated assembly of 16 distinct replication factors, consisting of 42 polypeptides. To understand the molecular interplay between these factors at the single residue level, new structural biology tools are being developed. Inspired by advances in single-molecule fluorescence imaging and cryo-tomography, novel single-particle cryo-EM experiments have been used to characterise the structural mechanism for the loading of the replicative helicase. Here, we discuss how in silico reconstitution of single-particle cryo-EM data can help describe dynamic systems that are difficult to approach with conventional three-dimensional classification tools.

7.
Maize diseases are a major source of yield loss, but due to the lack of human experience and limitations of traditional image-recognition technology, obtaining satisfactory large-scale identification results for maize diseases is difficult. Fortunately, the advancement of deep learning-based technology makes it possible to automatically identify diseases. However, it still faces issues caused by small sample sizes and complex field backgrounds, which affect the accuracy of disease identification. To address these issues, a deep learning-based method was proposed for maize disease identification in this paper. DenseNet121 was used as the main extraction network, and a multi-dilated-CBAM-DenseNet (MDCDenseNet) model was built by combining the multi-dilated module and the convolutional block attention module (CBAM) attention mechanism. Five models (MDCDenseNet, DenseNet121, ResNet50, MobileNetV2, and NASNetMobile) were compared and tested using three kinds of maize leaf images from the PlantVillage dataset and from field collections at Northeast Agricultural University in China. Furthermore, an auxiliary classifier generative adversarial network (ACGAN) and transfer learning were used to expand the dataset and pre-train for optimal identification results. When tested on field-collected datasets with a complex background, the MDCDenseNet model outperformed the other models with an accuracy of 98.84%. Therefore, it can provide a viable reference for the identification of maize leaf diseases collected from farmland with a small sample size and complex background.

8.
Pseudomonas phages are increasingly important biomedicines for phage therapy, but little is known about how these viruses package DNA. This paper explores the terminase subunits from the Myoviridae E217, a Pseudomonas-phage used in an experimental cocktail to eradicate P. aeruginosa in vitro and in animal models. We identified the large (TerL) and small (TerS) terminase subunits in two genes ~58 kb away from each other in the E217 genome. TerL presents a classical two-domain architecture, consisting of an N-terminal ATPase and C-terminal nuclease domain arranged into a bean-shaped tertiary structure. A 2.05 Å crystal structure of the C-terminal domain revealed an RNase H-like fold with two magnesium ions in the nuclease active site. Mutations in TerL residues involved in magnesium coordination had a dominant-negative effect on phage growth. However, the two ions identified in the active site were too far from each other to promote two-metal-ion catalysis, suggesting a conformational change is required for nuclease activity. We also determined a 3.38 Å cryo-EM reconstruction of E217 TerS that revealed a ring-like decamer, departing from the most common nonameric quaternary structure observed thus far. E217 TerS contains both N-terminal helix-turn-helix motifs enriched in basic residues and a central channel lined with basic residues large enough to accommodate double-stranded DNA. Overexpression of TerS caused a more than 4-fold reduction of E217 burst size, suggesting a catalytic amount of the protein is required for packaging. Together, these data expand the molecular repertoire of viral terminase subunits to Pseudomonas-phages used for phage therapy.

9.
Machine learning or deep learning models have been widely used for taxonomic classification of metagenomic sequences, and many studies have reported high classification accuracy. Such models are usually trained on sequences in several training classes in the hope of accurately classifying unknown sequences into these classes. However, when deploying the classification models on real testing data sets, sequences that do not belong to any of the training classes may be present and are falsely assigned to one of the training classes with high confidence. Such sequences are referred to as out-of-distribution (OOD) sequences and are ubiquitous in metagenomic studies. To address this problem, we develop a deep generative model-based method, MLR-OOD, that measures the probability of a testing sequence belonging to OOD by the likelihood ratio of the maximum of the in-distribution (ID) class conditional likelihoods and the Markov chain likelihood of the testing sequence, which measures the sequence complexity. We compose three different microbial data sets consisting of bacterial, viral, and plasmid sequences for comprehensively benchmarking OOD detection methods. We show that MLR-OOD achieves state-of-the-art performance, demonstrating the generality of MLR-OOD to various types of microbial data sets. It is also shown that MLR-OOD is robust to the GC content, which is a major confounding effect for OOD detection of genomic sequences. In conclusion, MLR-OOD will greatly reduce false positives caused by OOD sequences in metagenomic sequence classification.
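
The likelihood ratio described above can be sketched in log space as the largest in-distribution class log-likelihood minus a Markov chain log-likelihood of the test sequence. The code below is an illustrative sketch, not the authors' implementation. The class log-likelihoods are taken as given inputs (in the paper they come from deep generative models), and fitting the Markov chain to the test sequence itself, the chain order, the pseudocount, and all function names are assumptions made here.

```python
import math
from collections import Counter

def markov_log_likelihood(seq, order=2, alphabet="ACGT", pseudocount=1.0):
    """Log-likelihood of `seq` under an order-k Markov chain fitted to `seq` itself.

    Low-complexity (repetitive) sequences get a higher value, so this term
    acts as a rough measure of sequence complexity.
    """
    context_counts = Counter()
    transition_counts = Counter()
    for i in range(order, len(seq)):
        context = seq[i - order:i]
        context_counts[context] += 1
        transition_counts[(context, seq[i])] += 1

    log_likelihood = 0.0
    for i in range(order, len(seq)):
        context = seq[i - order:i]
        numerator = transition_counts[(context, seq[i])] + pseudocount
        denominator = context_counts[context] + pseudocount * len(alphabet)
        log_likelihood += math.log(numerator / denominator)
    return log_likelihood

def likelihood_ratio_score(class_log_likelihoods, seq, order=2):
    """Log of: max in-distribution class likelihood / Markov chain likelihood."""
    return max(class_log_likelihoods) - markov_log_likelihood(seq, order=order)

# Hypothetical log-likelihoods of one test sequence under two trained classes.
example_score = likelihood_ratio_score([-10.0, -5.0], "A" * 50)
```

Under this sketch a lower score would suggest an OOD sequence; where to place the decision threshold is left open because the abstract does not specify it.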

10.
Magnesium ions (Mg²⁺) are the most abundant divalent cations in living organisms and are essential for various physiological processes, including ATP utilization and the catalytic activity of numerous enzymes. Therefore, the homeostatic mechanisms associated with cellular Mg²⁺ are crucial for both eukaryotic and prokaryotic organisms and are thus strictly controlled by Mg²⁺ channels and transporters. Technological advances in structural biology, such as the expression screening of membrane proteins, in meso phase crystallization, and recent cryo-EM techniques, have enabled the structure determination of numerous Mg²⁺ channels and transporters. In this review article, we provide an overview of the families of Mg²⁺ channels and transporters (MgtE/SLC41, TRPM6/7, CorA/Mrs2, CorC/CNNM), and discuss the structural biology prospects based on the known structures of MgtE, TRPM7, CorA and CorC.

11.
Background: To date, EV characterization techniques are extremely diverse. The contribution of AFM, in particular, is often confined to size distribution. While AFM provides a unique possibility to carry out measurements in situ, nanomechanical characterization of EVs is still missing. Methods: Blood plasma EVs were isolated by ultracentrifugation and analyzed by flow cytometry and NTA. After cryo-EM, we applied PeakForce AFM to assess morphological and nanomechanical properties of EVs in liquid. Results: Nanoparticles were subdivided by their size, estimated for their suspended state, into sub-sets of small S1-EVs (< 30 nm) and S2-EVs (30–50 nm), and a sub-set of large L-EVs (50–170 nm). Non-membranous S1-EVs were distinguished by a higher Young's modulus (10.33 (7.36; 15.25) MPa) and were less deformed by the AFM tip (3.6 (2.8; 4.4) nm) compared to membrane exosomes S2-EVs (6.25 (4.52; 8.24) MPa and 4.8 (4.3; 5.9) nm). L-EVs were identified as large membrane exosomes, heterogeneous in their nanomechanical properties (22.43 (8.26; 53.11) MPa and 3.57 (2.07; 7.89) nm). Nanomechanical mapping revealed a few non-deformed L-EVs whose Young's modulus reached up to 300 MPa. Taken together with cryo-EM, these results lead us to suggest that two or more vesicles could be contained inside a large one, forming a multilayer vesicle. Conclusions: We identified particles similar in morphology and showed differences in nanomechanical properties that could be attributed to the features of their inner structure. General significance: Our results further elucidate the identification of EVs and concomitant nanoparticles based on their nanomechanical properties.

12.
Revealing high-resolution structures of microtubule-associated proteins (MAPs) is critical for understanding their fundamental roles in various cellular activities, such as cell motility and intracellular cargo transport. Nevertheless, large flexible molecular motors that dynamically bind and release microtubule networks are challenging for cryo-electron microscopy (cryo-EM). Traditional structure determination of MAPs bound to microtubules needs alignment information from the reconstruction of microtubules, which cannot be readily applied to large MAPs without a fixed binding pattern. Here, we developed a comprehensive approach to estimate the microtubule networks (multi-curve fitting), model the tubulin-lattice signals, and remove them (tubulin-lattice subtraction) from the raw cryo-EM micrographs. The approach does not require an ordered binding pattern of MAPs on microtubules, nor does it need a reconstruction of the microtubules. We demonstrated the capability of our approach using the reconstituted outer-arm dynein (OAD) bound to microtubule doublets. The tubulin-lattice subtraction improves the OAD alignment, thus leading to high-resolution reconstructions. In addition, the multi-curve fitting approach provides an accurate automatic alternative method to pick or segment filaments in 2D images and potentially in 3D tomograms. The accuracy of our approach has been demonstrated by using several other biological filaments. Our work provides a new tool to determine high-resolution structures of large MAPs bound to curved microtubule networks.

13.
Eukaryotic post-translational arginylation, mediated by the family of enzymes known as the arginyltransferases (ATE1s), is an important post-translational modification that can alter protein function and even dictate cellular protein half-life. Multiple major biological pathways are linked to the fidelity of this process, including neural and cardiovascular development, cell division, and even the stress response. Despite this significance, the structural, mechanistic, and regulatory features that govern ATE1 function remain enigmatic. To that end, we have used X-ray crystallography to solve the crystal structure of ATE1 from the model organism Saccharomyces cerevisiae (ScATE1) in the apo form. The three-dimensional structure of ScATE1 reveals a bilobed protein containing a GCN5-related N-acetyltransferase (GNAT) fold, and this crystalline behavior is faithfully recapitulated in solution based on size-exclusion chromatography-coupled small angle X-ray scattering (SEC-SAXS) analyses and cryo-EM 2D class averaging. Structural superpositions and electrostatic analyses point to this domain and its domain-domain interface as the location of catalytic activity and tRNA binding, and these comparisons strongly suggest a mechanism for post-translational arginylation. Additionally, our structure reveals that the N-terminal domain, which we have previously shown to bind a regulatory [Fe-S] cluster, is dynamic and disordered in the absence of metal bound in this location, hinting at the regulatory influence of this region. When taken together, these insights bring us closer to answering pressing questions regarding the molecular-level mechanism of eukaryotic post-translational arginylation.

14.
Generative deep learning is accelerating de novo drug design by allowing the generation of molecules with desired properties on demand. Chemical language models, which generate new molecules in the form of strings using deep learning, have been particularly successful in this endeavour. Thanks to advances in natural language processing methods and interdisciplinary collaborations, chemical language models are expected to become increasingly relevant in drug discovery. This minireview provides an overview of the current state of the art of chemical language models for de novo design, and analyses current limitations, challenges, and advantages. Finally, a perspective on future opportunities is provided.

15.
Deep generative models have gained recent popularity for chemical design. Many of these models have historically operated in 2D space; however, more recently, explicit 3D molecular generative models have become of interest, and they are the topic of this article. Dozens of published models have been developed in the last few years to generate molecules directly in 3D, outputting both the atom types and coordinates, either in one shot or by adding atoms or fragments step by step. These 3D generative models can also be guided by structural information such as a binding pocket representation to successfully generate molecules with docking score ranges similar to known actives, but they still show lower computational efficiency and generation throughput than 1D/2D generative models and sometimes produce unrealistic conformations. We advocate for a unified benchmark of metrics to evaluate generation and propose perspectives to be addressed in future implementations.

16.
PDBx/mmCIF, Protein Data Bank Exchange (PDBx) macromolecular Crystallographic Information Framework (mmCIF), has become the data standard for structural biology. With its early roots in the domain of small-molecule crystallography, PDBx/mmCIF provides an extensible data representation that is used for deposition, archiving, remediation, and public dissemination of experimentally determined three-dimensional (3D) structures of biological macromolecules by the Worldwide Protein Data Bank (wwPDB, wwpdb.org). Extensions of PDBx/mmCIF are similarly used for computed structure models by ModelArchive (modelarchive.org), integrative/hybrid structures by PDB-Dev (pdb-dev.wwpdb.org), small angle scattering data by the Small Angle Scattering Biological Data Bank SASBDB (sasbdb.org), and for models generated with the AlphaFold 2.0 deep learning software suite (alphafold.ebi.ac.uk). Community-driven development of PDBx/mmCIF spans three decades, involving contributions from researchers, software and methods developers in structural sciences, data repository providers, scientific publishers, and professional societies. Having a semantically rich and extensible data framework for representing a wide range of structural biology experimental and computational results, combined with expertly curated 3D biostructure data sets in public repositories, accelerates the pace of scientific discovery. Herein, we describe the architecture of the PDBx/mmCIF data standard, tools used to maintain representations of the data standard, governance, and processes by which data content standards are extended, plus community tools/software libraries available for processing and checking the integrity of PDBx/mmCIF data. Use cases exemplify how the members of the Worldwide Protein Data Bank have used PDBx/mmCIF as the foundation for their pipeline for delivering Findable, Accessible, Interoperable, and Reusable (FAIR) data to many millions of users worldwide.
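
To give a feel for the data representation, PDBx/mmCIF stores values as data items named `_category.attribute`. The sketch below reads only single-line, non-loop items from a small text fragment. It is a toy reader written for illustration, not one of the community libraries mentioned in the abstract: the function name and the example values are hypothetical, and `loop_` tables, multi-line text fields, and values containing apostrophes are deliberately not handled.

```python
import shlex

def parse_simple_items(text):
    """Collect single-line `_category.attribute value` items into nested dicts."""
    items = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line.startswith("_"):
            continue  # skip data_ headers, loop_ keywords, comments, blank lines
        parts = shlex.split(line)  # keeps quoted values such as 'A B' together
        if len(parts) != 2:
            continue  # item names inside loop_ blocks carry no value on the same line
        name, value = parts
        category, _, attribute = name[1:].partition(".")
        items.setdefault(category, {})[attribute] = value
    return items

# Hypothetical fragment; the numeric values are made up for illustration.
EXAMPLE = """\
data_EXAMPLE
_cell.length_a 50.0
_cell.length_b 60.0
_exptl.method 'ELECTRON MICROSCOPY'
"""

parsed = parse_simple_items(EXAMPLE)
```

For real files, a maintained PDBx/mmCIF parsing library is the safer choice, since the full syntax has several cases this sketch ignores.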

17.
18.
Purpose: Evaluation of a deep learning approach for the detection of meniscal tears and their characterization (presence/absence of migrated meniscal fragment). Methods: A large annotated adult knee MRI database was built combining medical expertise of radiologists and data scientists’ tools. Coronal and sagittal proton density fat suppressed-weighted images of 11,353 knee MRI examinations (10,401 individual patients) paired with their standardized structured reports were retrospectively collected. After database curation, deep learning models were trained and validated on a subset of 8058 examinations. Algorithm performance was evaluated on a test set of 299 examinations reviewed by 5 musculoskeletal specialists and compared to general radiologists’ reports. External validation was performed using the publicly available MRNet database. Receiver Operating Characteristic (ROC) curve results and Area Under the Curve (AUC) values were obtained on internal and external databases. Results: A combined architecture of meniscal localization and lesion classification 3D convolutional neural networks reached AUC values of 0.93 (95% CI 0.82, 0.95) for medial and 0.84 (95% CI 0.78, 0.89) for lateral meniscal tear detection, and 0.91 (95% CI 0.87, 0.94) for medial and 0.95 (95% CI 0.92, 0.97) for lateral meniscal tear migration detection. External validation of the combined medial and lateral meniscal tear detection models resulted in an AUC of 0.83 (95% CI 0.75, 0.90) without further training and 0.89 (95% CI 0.82, 0.95) with fine tuning. Conclusion: Our deep learning algorithm demonstrated high performance in knee menisci lesion detection and characterization, validated on an external database.
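
AUC values with 95% confidence intervals like those reported above can be computed in several ways. The sketch below uses the rank-based definition of AUC (the fraction of positive/negative pairs ranked correctly, with ties counted as half) and a percentile bootstrap for the interval. The abstract does not say how its intervals were obtained, so the bootstrap, the function names, and the toy data are all assumptions made for illustration.

```python
import random

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        if len(set(sample_labels)) < 2:
            continue  # resample when one class is missing from the draw
        estimates.append(auc(sample_labels, [scores[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy data: 1 = tear present, 0 = tear absent, with hypothetical model scores.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of examinations, which is fine for a test set of a few hundred cases but would call for a sort-based implementation on much larger data.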

19.
The natural populations of Dactylorhiza hatagirea have been greatly affected by incessant exploitation. As such, studies on its population attributes, together with habitat suitability and environmental factors affecting its distribution, need to be undertaken for its conservation in nature. The present study aimed at assessing the impact of anthropogenic pressure on population structure and locating suitable habitats for the conservation of this critically endangered orchid. Considerable changes in the phytosociological attributes were observed on account of the changing magnitude and extent of anthropogenic threat in their natural abode. The distribution pattern of the species indicated that more than 90% of the populations exhibit substantially aggregated spatial distribution. The Maximum Entropy (MaxEnt) distribution modelling algorithm was used to predict suitable habitat and potential area for its cultivation and reintroduction. Twenty-seven occurrence records, nineteen bioclimatic variables, altitude, and slope were used. The MaxEnt map output gave the habitat suitability for this species and predicted its distribution in the North-Western Himalayas of India for approximately 616 km². Jackknifing indicated that maximum temperature of the warmest month, annual mean temperature, mean temperature of the driest quarter, and mean temperature of the wettest quarter were the governing factors for its distribution and hence presented a higher gain with respect to other variables. According to permutation importance, precipitation seasonality and mean temperature of the wettest quarter show a prominent impact on the habitat distribution. Results of AUC (area under curve) were statistically significant (0.940), and the line of predicted omission falls very close to the omission on training samples, validating a better run of the model. Response curves revealed a probable increase in the occurrence of D. hatagirea with an increase in mean temperature of the wettest quarter and maximum temperature of the warmest month, which contributed more than 50% to predicted habitat suitability. Direct field observations concurrent with predicted habitat suitability and Google Earth images represent greater model thresholds for successful inception of the species. Together, the study proposes that the species can be conserved in or near its present-day natural habitats and is equally effective in determining the possible habitats for its cultivation and reintroduction.

20.
In-depth structural characterization of lipids is an essential component of lipidomics. There has been a rapid expansion of mass spectrometry methods that are capable of resolving lipid isomers at various structural levels over the past decade. These developments finally make deep-lipidotyping possible, which provides new means to study lipid metabolism and discover new lipid biomarkers. In this review, we discuss recent advancements in tandem mass spectrometry (MS/MS) methods for identification of complex lipids beyond the species (known headgroup information) and molecular species (known chain composition) levels. These include identification at the levels of carbon-carbon double bond (C=C) location and sn-position, as well as characterization of acyl chain modifications. We also discuss the integration of isomer-resolving MS/MS methods with different lipid analysis workflows and their applications in lipidomics. The results showcase the distinct capabilities of deep-lipidotyping in untangling the metabolism of individual isomers and sensitive phenotyping by using relative fractional quantitation of the isomers.
