Similar Documents
20 similar documents found (search time: 31 ms)
1.

Purpose

To evaluate the ability of nanometre-sized iron oxide particles conjugated with Azure A, a classic histological dye, to accumulate in areas of angiogenesis in a recently developed murine angiogenesis model.

Materials and methods

We characterised the Azure A particles with regard to their hydrodynamic size, zeta potential, and blood circulation half-life. The particles were then investigated by Magnetic Resonance Imaging (MRI) in a recently developed murine angiogenesis model along with reference particles (Ferumoxtran-10) and saline injections.

Results

The Azure A particles had a mean hydrodynamic diameter of 51.8 ± 43.2 nm, a zeta potential of −17.2 ± 2.8 mV, and a blood circulation half-life of 127.8 ± 74.7 min. Comparison of MR images taken pre- and 24-h post-injection revealed a significant increase in R2* relaxation rates for both Azure A and Ferumoxtran-10 particles. No significant difference was found for the saline injections. The relative increase was calculated for the three groups, and showed a significant difference between the saline group and the Azure A group, and between the saline group and the Ferumoxtran-10 group. However, no significant difference was found between the two particle groups.

Conclusion

Ultrahigh-field MRI revealed localisation of both types of iron oxide particles to areas of neovasculature. However, the Azure A particles did not show enhanced accumulation relative to Ferumoxtran-10, suggesting that the accumulation was passive in both cases.

2.

Background

Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of ‘omics’ data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis.

Results

We have developed the Gene eXpression Knowledge Base (GeXKB), a Semantic Web technology-based resource that contains integrated knowledge about gene expression regulation. To demonstrate the utility of GeXKB, we show how this resource can be exploited to identify candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions.
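A minimal sketch of how such a query might be issued programmatically, assuming a SPARQL endpoint is exposed for GeXKB; the endpoint URL and the vocabulary terms below are placeholders for illustration, not GeXKB's actual schema:

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Hypothetical endpoint -- a placeholder, not GeXKB's real address.
ENDPOINT = "http://example.org/gexkb/sparql"

QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex:   <http://example.org/vocab/>   # placeholder vocabulary
SELECT DISTINCT ?protein ?label WHERE {
    ?protein a ex:Protein ;
             rdfs:label ?label .
    # Further triple patterns would encode the selection criteria,
    # e.g. association with gastrin-regulated gene expression.
}
LIMIT 50
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

# Print candidate proteins; these would then be filtered against
# expression data and prior knowledge, as described above.
for row in results["results"]["bindings"]:
    print(row["protein"]["value"], row["label"]["value"])
```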

Conclusions

Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0386-y) contains supplementary material, which is available to authorized users.

3.

Background

Today researchers can choose from many bioinformatics protocols spanning all types of life sciences research, computational environments, and coding languages. Although the majority of these are open source, few possess all the qualities needed to maximize reuse and promote reproducible science. Wikipedia has proven a great tool for disseminating information and enhancing collaboration between users with varying expertise and backgrounds, allowing them to author quality content via crowdsourcing. However, it remains an open question whether the wiki paradigm can be applied to bioinformatics protocols.

Results

We piloted PyPedia, a wiki in which each article is both the implementation and the documentation of a bioinformatics computational protocol in the Python language. Hyperlinks within the wiki can be used to compose complex workflows and encourage reuse. A RESTful API enables code execution outside the wiki. The initial content of PyPedia contains articles for population statistics, bioinformatics format conversions, and genotype imputation. The easy-to-learn wiki syntax effectively lowers the barrier to bringing expert programmers and less computer-savvy researchers onto the same page.

Conclusions

PyPedia demonstrates how a wiki can provide a collaborative development, sharing, and even execution environment for biologists and bioinformaticians that complements existing resources and is useful for local and multi-center research teams.

Availability

PyPedia is available online at: http://www.pypedia.com. The source code and installation instructions are available at: https://github.com/kantale/PyPedia_server. The PyPedia python library is available at: https://github.com/kantale/pypedia. PyPedia is open-source, available under the BSD 2-Clause License.

4.

Background

Bioinformatics applications are now routinely used to analyze large amounts of data. Application development often requires many cycles of optimization, compiling, and testing. Repeatedly loading large datasets can significantly slow down the development process. We have incorporated HotSwap functionality into the protein workbench STRAP, allowing developers to create plugins using the Java HotSwap technique.

Results

Users can load multiple protein sequences or structures into the main STRAP user interface, and simultaneously develop plugins using an editor of their choice such as Emacs. Saving changes to the Java file causes STRAP to recompile the plugin and automatically update its user interface without requiring recompilation of STRAP or reloading of protein data. This article presents a tutorial on how to develop HotSwap plugins. STRAP is available at http://strapjava.de and http://www.charite.de/bioinf/strap.

Conclusion

HotSwap is a useful and time-saving technique for bioinformatics developers, and it is particularly effective for developing applications that require loading large amounts of data into memory.
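STRAP's mechanism is Java-specific; purely as an analogue of the underlying idea (reload the plugin code while the loaded data stays in memory), here is a small Python sketch using importlib.reload, with my_plugin standing in for a hypothetical plugin module and my_plugin.process for a hypothetical entry point:

```python
import importlib
import os
import time

import my_plugin  # hypothetical plugin module, edited in an external editor

def run_forever(loaded_data):
    """Keep the large dataset in memory; reload only the plugin code when its file changes."""
    last_mtime = 0.0
    while True:
        mtime = os.path.getmtime(my_plugin.__file__)
        if mtime > last_mtime:
            importlib.reload(my_plugin)    # pick up the freshly saved code
            last_mtime = mtime
        my_plugin.process(loaded_data)     # the data itself is never reloaded
        time.sleep(1.0)
```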

5.
6.
7.
Computation has become a critical component of research in biology. A risk has emerged that computational and programming challenges may limit research scope, depth, and quality. We review various solutions to common computational efficiency problems in ecological and evolutionary research. Our review pulls together material that is currently scattered across many sources and emphasizes those techniques that are especially effective for typical ecological and environmental problems. We demonstrate how straightforward it can be to write efficient code and implement techniques such as profiling or parallel computing. We supply a newly developed R package (aprof) that helps to identify computational bottlenecks in R code and determine whether optimization can be effective. Our review is complemented by a practical set of examples and detailed Supporting Information material (S1–S3 Texts) that demonstrate large improvements in computational speed (ranging from 10.5 times to 14,000 times faster). By improving computational efficiency, biologists can feasibly solve more complex tasks, ask more ambitious questions, and include more sophisticated analyses in their research.
This is part of the PLOS Computational Biology Education collection.
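The aprof package discussed above is an R tool; purely as a generic illustration of the same workflow (profile first to find the bottleneck, then parallelize the dominant step), here is a short Python sketch:

```python
import cProfile
import pstats
from multiprocessing import Pool

def slow_simulation(params):
    # Stand-in for an expensive per-replicate computation.
    return sum(i * params for i in range(10_000))

def run_serial(n=200):
    return [slow_simulation(p) for p in range(n)]

if __name__ == "__main__":
    # 1. Profile to locate the bottleneck before optimizing anything.
    cProfile.run("run_serial()", "profile.out")
    pstats.Stats("profile.out").sort_stats("cumulative").print_stats(5)

    # 2. If one function dominates, parallelize that step across cores.
    with Pool() as pool:
        results = pool.map(slow_simulation, range(200))
```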

8.
Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall into both T1 translational research—translating basic science results into new interventions—and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

What to Learn in This Chapter

Text mining is an established field, but its application to translational bioinformatics is quite new and it presents myriad research opportunities. It is made difficult by the fact that natural (human) language, unlike computer language, is characterized at all levels by rampant ambiguity and variability. Important sub-tasks include gene name recognition, or finding mentions of gene names in text; gene normalization, or mapping mentions of genes in text to standard database identifiers; phenotype recognition, or finding mentions of phenotypes in text; and phenotype normalization, or mapping mentions of phenotypes to concepts in ontologies. Text mining for translational bioinformatics can necessitate dealing with two widely varying genres of text—published journal articles, and prose fields in electronic medical records. Research into the latter has been impeded for years by lack of public availability of data sets, but this has very recently changed and the field is poised for rapid advances. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.
This article is part of the “Translational Bioinformatics” collection for PLOS Computational Biology.
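As a toy illustration of two of the sub-tasks named above, gene name recognition and gene normalization, the sketch below uses a single regular expression and a small invented lexicon; real systems rely on much richer rules or trained models, and the identifiers shown are made up:

```python
import re

# Tiny illustrative lexicon mapping surface forms to invented database identifiers.
GENE_LEXICON = {
    "brca1": "GENE:0001",
    "tp53": "GENE:0002",
    "p53": "GENE:0002",   # synonym mapped to the same identifier
}

# Crude mention pattern: short all-caps/alphanumeric tokens, plus the common form "p53".
MENTION_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]{1,5}\b|\bp53\b")

def recognize_and_normalize(text):
    """Find candidate gene mentions, then map each one to a lexicon identifier."""
    hits = []
    for match in MENTION_PATTERN.finditer(text):
        mention = match.group(0)
        gene_id = GENE_LEXICON.get(mention.lower())  # normalization step; None = unmapped
        hits.append((mention, gene_id))
    return hits

# Note the false positive "DNA", illustrating the ambiguity discussed above.
print(recognize_and_normalize("Mutations in BRCA1 and p53 alter DNA repair."))
```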

9.

Background

Each year more than 10 million people worldwide are burned severely enough to require medical attention, and clinical outcomes are noticeably worse in resource-poor settings. Expert clinical advice on acute injuries can play a determining role, and there is a need for novel approaches that allow timely access to such advice. We developed an interactive mobile phone application that enables transfer of both patient data and pictures of a wound from the point of care to a remote burns expert who, in turn, provides advice back.

Methods and Results

The application is an integrated clinical decision support system comprising a mobile phone application and server software running in a cloud environment. The client application is installed on a smartphone, and structured patient data and photographs can be captured in a protocol-driven manner. The user can indicate the specific injured body surface(s) through a touchscreen interface, and an integrated calculator estimates the total body surface area affected by the burn injury. Predefined standardised care advice, including total fluid requirement, is provided immediately by the software, and the case data are relayed to a cloud server. A text message is automatically sent to a burn expert on call, who can then access the cloud server with the smartphone app or a web browser, review the case and pictures, and respond with both structured and personalized advice to the health care professional at the point of care.
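The abstract does not state which formulas the application implements; purely as an illustration, the sketch below combines an adult rule-of-nines estimate of total body surface area with the commonly used Parkland formula (4 mL x kg x %TBSA over the first 24 h) for the initial fluid requirement:

```python
# Adult rule-of-nines percentages for body regions (paediatric charts differ).
RULE_OF_NINES = {
    "head": 9.0, "left_arm": 9.0, "right_arm": 9.0,
    "left_leg": 18.0, "right_leg": 18.0,
    "anterior_trunk": 18.0, "posterior_trunk": 18.0, "perineum": 1.0,
}

def estimate_tbsa(burned_regions):
    """Sum the rule-of-nines percentages of the regions marked as burned."""
    return sum(RULE_OF_NINES[region] for region in burned_regions)

def parkland_fluid_ml(weight_kg, tbsa_percent):
    """Parkland formula: 4 mL x kg x %TBSA over 24 h, half given in the first 8 h."""
    total = 4.0 * weight_kg * tbsa_percent
    return total, total / 2.0

tbsa = estimate_tbsa(["left_arm", "anterior_trunk"])   # 27% TBSA in this example
total_24h, first_8h = parkland_fluid_ml(weight_kg=70, tbsa_percent=tbsa)
print(f"TBSA {tbsa:.0f}% -> {total_24h:.0f} mL over 24 h, {first_8h:.0f} mL in the first 8 h")
```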

Conclusions

In this article, we present the design of the smartphone and server applications alongside the type of structured patient data collected and the pictures taken at the point of care. We report on how the application will be introduced at the point of care and how its clinical impact will be evaluated prior to roll-out. We identify challenges, strengths, and limitations of the system that may help or hinder the expected outcome: a solution for remote consultation on burns that can be integrated into routine acute clinical care and thereby promote equity in emergency injury care, a growing public health burden.

10.
A common issue in bioinformatics is that computational methods often generate a large number of predictions sorted according to certain confidence scores. A key problem is then determining how many predictions must be selected to include most of the true predictions while maintaining reasonably high precision. In nuclear magnetic resonance (NMR)-based protein structure determination, for instance, computational peak picking methods are becoming more and more common, although expert knowledge remains the method of choice for determining how many peaks among thousands of candidate peaks should be taken into consideration to capture the true peaks. Here, we propose a Benjamini-Hochberg (B-H)-based approach that automatically selects the number of peaks. We formulate the peak selection problem as a multiple testing problem. Given a candidate peak list sorted by either volumes or intensities, we first convert the peaks into p-values and then apply the B-H-based algorithm to automatically select the number of peaks. The proposed approach is tested on state-of-the-art peak picking methods, including WaVPeak [1] and PICKY [2]. Compared with the traditional fixed-number-based approach, our approach returns significantly more true peaks. For instance, by combining WaVPeak or PICKY with the proposed method, the missing peak rates are on average reduced by 20% and 26%, respectively, in a benchmark set of 32 spectra extracted from eight proteins. The consensus of the B-H-selected peaks from both WaVPeak and PICKY achieves 88% recall and 83% precision, which significantly outperforms each individual method and the consensus method without the B-H algorithm. The proposed method can be used as a standard procedure for any peak picking method and can be straightforwardly applied to other prediction selection problems in bioinformatics. The source code, documentation, and example data of the proposed method are available at http://sfb.kaust.edu.sa/pages/software.aspx.
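The selection step itself is the standard Benjamini-Hochberg step-up procedure; a minimal sketch is shown below (the conversion of peak volumes or intensities into p-values is method-specific and omitted, and the p-values used here are invented):

```python
import numpy as np

def benjamini_hochberg_select(pvalues, alpha=0.05):
    """Return indices (into pvalues) selected by the B-H step-up procedure at FDR level alpha."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)                           # sort p-values ascending
    thresholds = alpha * (np.arange(1, m + 1) / m)  # k/m * alpha for k = 1..m
    below = p[order] <= thresholds
    if not below.any():
        return np.array([], dtype=int)
    k = np.max(np.nonzero(below)[0])                # largest k meeting its threshold
    return order[: k + 1]                           # keep everything up to and including k

# Toy example with made-up p-values (in practice, one per candidate peak).
pvals = [0.001, 0.008, 0.039, 0.041, 0.27, 0.62]
print(benjamini_hochberg_select(pvals, alpha=0.05))   # -> [0 1]
```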

11.
Bioinformatic research relies on large-scale computational infrastructures, which have a nonzero carbon footprint, but so far no study has quantified the environmental costs of bioinformatic tools and commonly run analyses. In this work, we estimate the carbon footprint of bioinformatics (in kilograms of CO2 equivalent units, kgCO2e) using the freely available Green Algorithms calculator (www.green-algorithms.org, last accessed 2022). We assessed 1) bioinformatic approaches in genome-wide association studies (GWAS), RNA sequencing, genome assembly, metagenomics, phylogenetics, and molecular simulations, as well as 2) computation strategies, such as parallelization, CPU (central processing unit) versus GPU (graphics processing unit), cloud versus local computing infrastructure, and geography. In particular, we found that biobank-scale GWAS emitted substantial kgCO2e and that simple software upgrades could make it greener; for example, upgrading from BOLT-LMM v1 to v2.3 reduced the carbon footprint by 73%. Moreover, switching from the average data center to a more efficient one can reduce the carbon footprint by approximately 34%. Memory over-allocation can also be a substantial contributor to an algorithm’s greenhouse gas emissions. The use of faster processors or greater parallelization reduces running time but can lead to a greater carbon footprint. Finally, we provide guidance on how researchers can reduce power consumption and minimize kgCO2e. Overall, this work elucidates the carbon footprint of common analyses in bioinformatics and provides solutions that empower a move toward greener research.
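A simplified version of the arithmetic behind such estimates is sketched below (the Green Algorithms calculator accounts for more factors than shown here); the hardware power figures and grid carbon intensity are illustrative placeholders, not values from the study:

```python
def estimate_kgco2e(runtime_h, n_cores, core_power_w, mem_gb, mem_power_w_per_gb,
                    cpu_usage, pue, carbon_intensity_g_per_kwh):
    """Energy drawn by cores and memory, scaled by data-centre overhead (PUE),
    converted to CO2-equivalent via the electricity grid's carbon intensity."""
    power_w = n_cores * core_power_w * cpu_usage + mem_gb * mem_power_w_per_gb
    energy_kwh = runtime_h * power_w * pue / 1000.0
    return energy_kwh * carbon_intensity_g_per_kwh / 1000.0   # grams -> kilograms

# Illustrative numbers only: a 48 h, 16-core job with 64 GB of memory.
print(estimate_kgco2e(runtime_h=48, n_cores=16, core_power_w=12, mem_gb=64,
                      mem_power_w_per_gb=0.37, cpu_usage=1.0, pue=1.67,
                      carbon_intensity_g_per_kwh=300))
```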

12.
This paper analyzes the biochemical equilibria between bivalent receptors, homo-bifunctional ligands, monovalent inhibitors, and their complexes. Such reaction schemes arise in the immune response, where immunoglobulins (bivalent receptors) bind to pathogens or allergens. The equilibria may be described by an infinite system of algebraic equations, which accounts for complexes of arbitrary size n (n being the number of receptors present in the complex). The system can be reduced to just 3 algebraic equations for the concentrations of free (unbound) receptor, free ligand, and free inhibitor. Concentrations of all other complexes can be written explicitly in terms of these variables. We analyze how concentrations of key (experimentally measurable) quantities vary with system parameters. Such measured quantities can furnish important information about dissociation constants in the system, which are difficult to obtain by other means. We provide analytical expressions and suggest specific experiments that could be used to determine the dissociation constants.

13.

Background

Flux balance analysis (FBA) is a widely-used method for analyzing metabolic networks. However, most existing tools that implement FBA require downloading software and writing code. Furthermore, FBA generates predictions for metabolic networks with thousands of components, so meaningful changes in FBA solutions can be difficult to identify. These challenges make it difficult for beginners to learn how FBA works.

Results

To meet this need, we present Escher-FBA, a web application for interactive FBA simulations within a pathway visualization. Escher-FBA allows users to set flux bounds, knock out reactions, change objective functions, upload metabolic models, and generate high-quality figures without downloading software or writing code. We provide detailed instructions on how to use Escher-FBA to replicate several FBA simulations that generate real scientific hypotheses.
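The calculations that Escher-FBA exposes are instances of the standard FBA linear program: choose fluxes v that maximize an objective subject to steady-state mass balance (S v = 0) and bounds on each flux. A toy example using scipy (illustrative only, not Escher-FBA's own code) is:

```python
import numpy as np
from scipy.optimize import linprog

# Toy network: A is taken up (EX_A), converted to B (R1), and drained into biomass (R_bio).
#                EX_A   R1   R_bio
S = np.array([[  1.0, -1.0,  0.0],    # metabolite A
              [  0.0,  1.0, -1.0]])   # metabolite B

bounds = [(0, 10), (0, 1000), (0, 1000)]   # flux bounds; "knocking out" R1 would set (0, 0)
objective = np.array([0.0, 0.0, -1.0])     # maximize R_bio (linprog minimizes, hence the sign)

result = linprog(c=objective, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
print("optimal biomass flux:", -result.fun)   # -> 10.0, limited by the uptake bound
print("flux distribution:", result.x)
```

Changing the bounds, zeroing out a reaction, or editing the objective vector corresponds directly to the interactive operations described above.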

Conclusions

We designed Escher-FBA to be as intuitive as possible so that users can quickly and easily understand the core concepts of FBA. The web application can be accessed at https://sbrg.github.io/escher-fba.

14.
Evolving in sync with the computation revolution over the past 30 years, computational biology has emerged as a mature scientific field. While the field has made major contributions toward improving scientific knowledge and human health, individual computational biology practitioners at various institutions often languish in career development. As optimistic biologists passionate about the future of our field, we propose solutions for both eager and reluctant individual scientists, institutions, publishers, funding agencies, and educators to fully embrace computational biology. We believe that in order to pave the way for the next generation of discoveries, we need to improve recognition for computational biologists and better align pathways of career success with pathways of scientific progress. With 10 outlined steps, we call on all adjacent fields to move away from the traditional individual, single-discipline investigator research model and embrace multidisciplinary, data-driven, team science.

Do you want to attract computational biologists to your project or to your department? Despite the major contributions of computational biology, those attempting to bridge the interdisciplinary gap often languish in career advancement, publication, and grant review. Here, sixteen computational biologists around the globe present "A field guide to cultivating computational biology," focusing on solutions.

Biology in the digital era requires computation and collaboration. A modern research project may include multiple model systems, use multiple assay technologies, collect varying data types, and require complex computational strategies, which together make effective design and execution difficult or impossible for any individual scientist. While some labs, institutions, funding bodies, publishers, and other educators have already embraced a team science model in computational biology and thrived [17], others who have not yet fully adopted it risk severely lagging behind the cutting edge. We propose a general solution: “deep integration” between biology and the computational sciences. Many different collaborative models can yield deep integration, and different problems require different approaches (Fig 1).

Fig 1. Supporting interdisciplinary team science will accelerate biological discoveries. Scientists who have little exposure to different fields build silos, in which they perform science without external input. To solve hard problems and to extend your impact, collaborate with diverse scientists, communicate effectively, recognize the importance of core facilities, and embrace research parasitism. In biologically focused parasitism, wet lab biologists use existing computational tools to solve problems; in computationally focused parasitism, primarily dry lab biologists analyze publicly available data. Both strategies maximize the use and societal benefit of scientific data.

In this article, we define computational science extremely broadly to include all quantitative approaches such as computer science, statistics, machine learning, and mathematics. We also define biology broadly, including any scientific inquiry pertaining to life and its many complications. A harmonious deep integration between biology and computer science requires action—we outline 10 immediate calls to action in this article and aim our speech directly at individual scientists, institutions, funding agencies, and publishers in an attempt to shift perspectives and enable action toward accepting and embracing computational biology as a mature, necessary, and inevitable discipline (Box 1).

Box 1. Ten calls to action for individual scientists, funding bodies, publishers, and institutions to cultivate computational biology. Many actions require increased funding support, while others require a perspective shift. For those actions that require funding, we believe convincing the community of need is the first step toward agencies and systems allocating sufficient support.
  1. Respect collaborators’ specific research interests and motivations
     Problem: Researchers face conflicts when their goals do not align with collaborators. For example, projects with routine analyses provide little benefit for computational biologists.
     Solution: Explicit discussion about interests/expertise/goals at project onset.
     Opportunity: Clearly defined expectations identify gaps, provide commitment to mutual benefit.
  2. Seek necessary input during project design and throughout the project life cycle
     Problem: Modern research projects require multiple experts spanning the project’s complexity.
     Solution: Engage complementary scientists with necessary expertise throughout the entire project life cycle.
     Opportunity: Better designed and controlled studies with higher likelihood for success.
  3. Provide and preserve budgets for computational biologists’ work
     Problem: The perception that analysis is “free” leads to collaborator budget cuts.
     Solution: When budget cuts are necessary, ensure that they are spread evenly.
     Opportunity: More accurate, reproducible, and trustworthy computational analyses.
  4. Downplay publication author order as an evaluation metric for computational biologists
     Problem: Computational biologist roles on publications are poorly understood and undervalued.
     Solution: Journals provide more equitable opportunities, funding bodies and institutions improve understanding of the importance of team science, scientists educate each other.
     Opportunity: Engage more computational biologist collaborators, provide opportunities for more high-impact work.
  5. Value software as an academic product
     Problem: Software is relatively undervalued and can end up poorly maintained and supported, wasting the time put into its creation.
     Solution: Scientists cite software, and funding bodies provide more software funding opportunities.
     Opportunity: More high-quality maintainable biology software will save time, reduce reimplementation, and increase analysis reproducibility.
  6. Establish academic structures and review panels that specifically reward team science
     Problem: Current mechanisms do not consistently reward multidisciplinary work.
     Solution: Separate evaluation structures to better align peer review to reward indicators of team science.
     Opportunity: More collaboration to attack complex multidisciplinary problems.
  7. Develop and reward cross-disciplinary training and mentoring
     Problem: Academic labs and institutions are often insufficiently equipped to provide training to tackle the next generation of biological problems, which require computational skills.
     Solution: Create better training programs aligned to necessary on-the-job skills with an emphasis on communication, encourage wet/dry co-mentorship, and engage younger students to pursue computational biology.
     Opportunity: Interdisciplinary students uncover important insights in their own data.
  8. Support computing and experimental infrastructure to empower computational biologists
     Problem: Individual computational labs often fund suboptimal cluster computing systems and lack access to data generation facilities.
     Solution: Institutions can support centralized compute and engage core facilities to provide data services.
     Opportunity: Time and cost savings for often overlooked administrative tasks.
  9. Provide incentives and mechanisms to share open data to empower discovery through reanalysis
     Problem: Data are often siloed and have untapped potential.
     Solution: Provide institutional data storage with standardized identifiers and provide separate funding mechanisms and publishing venues for data reuse.
     Opportunity: Foster new breed of researchers, “research parasites,” who will integrate multimodal data and enhance mechanistic insights.
  10. Consider infrastructural, ethical, and cultural barriers to clinical data access
     Problem: Identifiable health data, which include sensitive information that must be kept hidden, are distributed and disorganized, and thus underutilized.
     Solution: Leadership must enforce policies to share deidentifiable data with interoperable metadata identifiers.
     Opportunity: Derive new insights from multimodal data integration and build datasets with increased power to make biological discoveries.

15.
The Zika virus outbreak in the Americas has caused global concern. To help accelerate this fight against Zika, we launched the OpenZika project. OpenZika is an IBM World Community Grid Project that uses distributed computing on millions of computers and Android devices to run docking experiments, in order to dock tens of millions of drug-like compounds against crystal structures and homology models of Zika proteins (and other related flavivirus targets). This will enable the identification of new candidates that can then be tested in vitro, to advance the discovery and development of new antiviral drugs against the Zika virus. The docking data is being made openly accessible so that all members of the global research community can use it to further advance drug discovery studies against Zika and other related flaviviruses.

The Zika virus (ZIKV) has emerged as a major public health threat to the Americas as of 2015 [1]. We have previously suggested that it represents an opportunity for scientific collaboration and open scientific exchange [2]. The health of future generations may very well depend on the decisions we make, our willingness to share our findings quickly, and open collaboration to rapidly find a cure for this disease. Since February 1, 2016, when the World Health Organization deemed the cluster of microcephaly cases, Guillain-Barré, and other neurological disorders associated with ZIKV in Latin America and the Caribbean as constituting a Public Health Emergency of International Concern [3] (PHEIC), we have seen a rapid increase in publications (S1 References and main references). We [2] and others [4,5] described steps that could be taken to initiate a drug discovery program on ZIKV. For example, computational approaches, such as virtual screening of chemical libraries or focused screening to repurpose FDA and/or EU-approved drugs, can be used to help accelerate the discovery of an anti-ZIKV drug. An antiviral drug discovery program can be initiated using structure-based design, based on homology models of the key ZIKV proteins. With the lack of structural information regarding the proteins of ZIKV, we built homology models for all the ZIKV proteins, based on close homologs such as dengue virus, using freely available software [6] (S1 Table). These were made available online on March 3, 2016. We also predicted the site of glycosylation of glycoprotein E as Asn154, which was recently experimentally verified [7].

Since the end of March 2016, we have now seen two cryo-EM structures and 16 crystal structures of five target classes (S1 Table). These structures, alongside the homology models, represent potential starting points for docking-based virtual screening campaigns to help find molecules that are predicted to have high affinity with ZIKV proteins. These predictions can then be tested against the virus in cell-based assays and/or using individual protein-based assays. There are millions of molecules available that can be assayed, but which ones are likely to work, and how should we prioritize them?

In March, we initiated a new open collaborative project called OpenZika (Fig 1), with IBM’s World Community Grid (WCG, worldcommunitygrid.org), which has been used previously for distributed computing projects (S2 Table).
On May 18, 2016, the OpenZika project began the virtual screening of ~6 million compounds that are in the ZINC database (Fig 1), as well as the FDA-approved drugs and the NIH clinical collection, using AutoDock Vina and the homology models and crystal structures (S1 Table, S1 Text, S1 References), to discover novel candidate compounds that can potentially be developed into new drugs for treating ZIKV. These will be followed by additional virtual screens with a new ZINC library of ~38 million compounds, and the PubChem database (at most ~90 million compounds), after their structures are prepared for docking.

Fig 1. Workflow for the OpenZika project. A. Docking input files of the targets and ligands are prepared, and positive control docking studies are performed. The crystallographic binding mode of a known inhibitor is shown as sticks with dark purple carbon atoms, while the docked binding mode against the NS5 target from HCV has cyan carbons. Our pdbqt files of the libraries of compounds we screen are also openly accessible (http://zinc.docking.org/pdbqt/). B. We have already prepared the docking input files for ~6 million compounds from ZINC (i.e., the libraries that ALP previously used in the GO Fight Against Malaria project on World Community Grid), which are currently being used in the initial set of virtual screens on OpenZika. C. IBM’s World Community Grid is an internet-distributed network of millions of computers (Mac, Windows, and Linux) and Android-based tablets or smartphones in over 80 countries. Over 715,000 volunteers donate their dormant computer time (that would otherwise be wasted) towards different projects that are both (a) run by an academic or nonprofit research institute, and (b) are devoted to benefiting humanity. D. OpenZika is harnessing World Community Grid to dock millions of commercially available compounds against multiple ZIKV homology models and crystal structures (and targets from related viruses) using AutoDock Vina (AD Vina). This ultimately produces candidates (virtual hits that produced the best docking scores and displayed the best interactions with the target during visual inspection) against individual proteins, which can then be prioritized for in vitro testing by collaborators. After it is inspected, all computational data against ZIKV targets will be made open to the public on our website (http://openzika.ufg.br/experiments/#tab-id-7), and OpenZika results are also available upon request. The computational and experimental data produced will be published as quickly as possible.

Initially, compounds are being screened against the ZIKV homologs of drug targets that have been well-validated in research against dengue and hepatitis C viruses, such as NS5 and Glycoprotein E (S1 Table, S1 Text, S1 References). These may allow us to identify broad-spectrum antivirals against multiple flaviviruses, such as dengue virus, West Nile virus, and yellow fever virus. In addition, docking against the crystal structure of a related protein from a different pathogen can sometimes discover novel hits against the pathogen of interest [8].

As well as applying docking-based filters, the compounds virtually screened on OpenZika will also be filtered using machine learning models (S1 Text, S1 References). These should be useful selection criteria for subsequent tests by our collaborators in whole-cell ZIKV assays, to verify their antiviral activity for blocking ZIKV infection or replication.
Since all OpenZika docking data will be in the public domain soon after they are completed and verified, we and other labs can then advance the development of some of these new virtual candidates into experimentally validated hits, leads, and drugs through collaborations with wet labs.

This exemplifies open science, which should help scientists around the world as they address the long and arduous process of discovering and developing new drugs. Screening millions of compounds against many different protein models in this way would take far more resources and time than any academic researcher could generally obtain or spend. As of August 16, 2016, we have submitted 894 million docking jobs. Over 6,934 CPU years have been donated to us, enabling over 439 million different docking jobs. We recently selected an initial batch of candidates for NS3 helicase (data openly available at http://openzika.ufg.br/experiments/#tab-id-7), for in vitro testing. Without the unique community of volunteers and tremendous resources provided by World Community Grid, this project would have been very difficult to initiate in a reasonable time frame at this scale.

The OpenZika project will ultimately generate several billion docking results, which could make it the largest computational drug discovery project ever performed in academia. The potential challenges we foresee will be finding laboratories with sufficient funding to pursue compounds, synthesize analogs, and develop target-based assays to validate our predictions and generate SAR (Structure-Activity Relationship) data to guide the process of developing the new hits into leads and then drugs. Due to the difficult nature of drug discovery and the eventual evolution of drug resistance, funding of ZIKV research once initiated will likely need to be sustained for several years, if not longer (e.g., HIV research has been funded for decades). As with other WCG projects, once scientists identify experimentally validated leads, finding a company to license them and pursue them in clinical trials and beyond will need incentives such as the FDA Tropical Disease Priority voucher [9], which has a financial value on the open market [10].

By working together and opening our research to the scientific community, many other labs will also be able to take promising molecular candidates forward to accelerate progress towards defeating the ZIKV outbreak. We invite any interested researcher to join us (send us your models or volunteer to assay the candidates we identify through this effort against any of the flaviviruses), and we hope new volunteers in the general public will donate their dormant, spare computing cycles to this cause. We will ultimately report the full computational and experimental results of this collaboration.
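At the level of a single docking job, such a screen reduces to running AutoDock Vina on one receptor/ligand pair; a minimal sketch of wrapping that call from Python is shown below. The file names and search-box coordinates are placeholders, and this is not the actual OpenZika/World Community Grid job wrapper:

```python
import subprocess

def dock_one(receptor_pdbqt, ligand_pdbqt, out_pdbqt, center, size, exhaustiveness=8):
    """Run one AutoDock Vina docking job; center/size define the search box in Angstroms."""
    cmd = [
        "vina",
        "--receptor", receptor_pdbqt,
        "--ligand", ligand_pdbqt,
        "--out", out_pdbqt,
        "--center_x", str(center[0]), "--center_y", str(center[1]), "--center_z", str(center[2]),
        "--size_x", str(size[0]), "--size_y", str(size[1]), "--size_z", str(size[2]),
        "--exhaustiveness", str(exhaustiveness),
    ]
    subprocess.run(cmd, check=True)

# Placeholder inputs: a prepared ZIKV target model and one compound in PDBQT format.
dock_one("ns5_model.pdbqt", "zinc_compound.pdbqt", "docked.pdbqt",
         center=(10.0, 12.5, -3.0), size=(20.0, 20.0, 20.0))
```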

Advantages and Disadvantages of OpenZika

Advantages
  • Open Science could accelerate the discovery of new antivirals using docking and virtual screening
  • Docking narrows down compounds to test, which saves time and money
  • Free to use distributed computing on World Community Grid, and the workflow is simpler than using conventional supercomputers
Disadvantages
  • Concern around intellectual property ownership and whether companies will develop drugs coming from this effort
  • Need for experimental assays will always be a factor
  • Testing in vitro and in vivo is not free, nor are the samples of the compounds

16.
The range of heterogeneous approaches available for quantifying protein abundance via mass spectrometry (MS) leads to considerable challenges in modeling, archiving, exchanging, or submitting experimental data sets as supplemental material to journals. To date, there has been no widely accepted format for capturing the evidence trail of how quantitative analysis has been performed by software, for transferring data between software packages, or for submitting to public databases. In the context of the Proteomics Standards Initiative, we have developed the mzQuantML data standard. The standard can represent quantitative data about regions in two-dimensional retention time versus mass/charge space (called features), peptides, and proteins and protein groups (where there is ambiguity regarding peptide-to-protein inference), and it offers limited support for small molecule (metabolomic) data. The format has structures for representing replicate MS runs, grouping of replicates (for example, as study variables), and capturing the parameters used by software packages to arrive at these values. The format has the capability to reference other standards such as mzML and mzIdentML, and thus the evidence trail for the MS workflow as a whole can now be described. Several software implementations are available, and we encourage other bioinformatics groups to use mzQuantML as an input, internal, or output format for quantitative software and for structuring local repositories. All project resources are available in the public domain from the HUPO Proteomics Standards Initiative http://www.psidev.info/mzquantml.

The Proteomics Standards Initiative (PSI) has been working for ten years to improve the reporting and standardization of proteomics data. The PSI has published minimum reporting guidelines, called MIAPE (Minimum Information about a Proteomics Experiment) documents, for MS-based proteomics (1) and molecular interactions (2), as well as data standards for raw/processed MS data in mzML (3), peptide and protein identifications in mzIdentML (4), transitions for selected reaction monitoring analysis in TraML (5), and molecular interactions in PSI-MI format (6). Standards are particularly important for quantitative proteomics research, because the associated bioinformatics analysis is highly challenging as a result of the range of different experimental techniques for deriving abundance values for proteins using MS. The techniques can be broadly divided into those based on (i) differential labeling, in which a metabolic label or chemical tag is applied to cells, peptides, or proteins, samples are mixed, and intensity signals for peptide ions are compared within single MS runs; or (ii) label-free methods in which MS runs occur in parallel and bioinformatics methods are used to extract intensity signals, ensuring that like-for-like signals are compared between runs (7). In most label-based and label-free approaches, peptide ratios or abundance values must be summarized in order for one to arrive at relative protein abundance values, taking into account ambiguity in peptide-to-protein inference. Absolute protein abundance values can typically be derived only using internal standards spiked into samples of known abundance (8, 9).
The PSI has recently developed a MIAPE-Quant document defining and describing the minimal information necessary in order to judge or repeat a quantitative proteomics experiment.

Software packages tend to report peptide or protein abundance values in a bespoke format, often as tab or comma separated values, for import into spreadsheet software. In complementary work, the PSI has developed a standard format for capturing these final results in a standardized tab separated value format, called mzTab, suitable for post-processing and visualization in end-user tools such as Microsoft Excel or the R programming language. The final results of a quantitative analysis are sufficient for many purposes, such as performing statistical analysis to determine differential expression or cluster analysis to find co-expressed proteins. However, mzTab (or similar bespoke formats) was not designed to hold a trace of how the peptide and protein abundance values were calculated from MS data (i.e. metadata is lost that might be crucial for other tasks). For example, most quantitative software packages detect and quantify so-called “features” (representing all ions collected for a given peptide) in two-dimensional MS data, where the two dimensions are retention time from liquid chromatography (LC) and mass over charge (m/z). Without capturing the two-dimensional coordinates of the features, it is not possible to write visualization software showing exactly what the software has quantified; researchers have to trust that the software has accurately quantified all ions from isotopes of a given peptide, excluding any overlapping ions derived from other peptides. The history of proteomics research has been one in which studies of highly variable quality have been published. There is also little quality control or benchmarking performed on quantitative software (10), meaning it is difficult to make quality judgments on a set of peptide and protein abundance values. The PSI has recently developed mzML, which can capture raw or processed MS data in a vendor neutral format, and the mzIdentML standard, to capture search engine results and the important metadata (such as software parameters), such that peptide and protein identification data can be interpreted consistently. These two standards are now being used for data sharing and to support open source software development, so that informatics groups can focus on algorithmic development rather than file format conversions. Until now, there has been no widely used open source format or data standard for capturing metadata and data relating to the quantitation step of analysis pipelines. In this work, we report the mzQuantML standard from the PSI, which has recently completed the PSI standardization process (11), from which version 1.0 was released. We believe that quantitative proteomics research will benefit from improved capabilities for tracing what manipulations have happened to data at each stage of the analysis process. The mzQuantML standard has been designed to store quantitative values calculated for features, peptides, proteins, and/or protein groups (where there is ambiguity in protein inference), plus associated software parameters. It has also been designed to accommodate small molecule data to improve interoperability with metabolomics investigations. The format can represent experimental replicates and grouping of replicates, and it has been designed via an open and transparent process.

17.
Conclusion: Of the batches of dye examined, only one (Azure B, Merck number YE 132) can be regarded as being of acceptable quality. This material retails at $6.60/g for a 250 g lot, and may be considered to be reasonably priced. The quality of all Serva's pure dyes is alarmingly poor and leaves much to be desired. In view of the fact that these dyes retail at $700.00/g, the purchaser should expect to obtain analytically pure material. It is hoped that this note may persuade users of commercially available pure Azure dyes to check the purity of any samples they might possess: manufacturers' claims should not be accepted at face value.

18.
19.

Background

Metal ions play a critical role in the stabilization of RNA structures. Therefore, accurate prediction of ion effects in RNA folding can have a far-reaching impact on our understanding of RNA structure and function. Multivalent ions, especially Mg2+, are essential for RNA tertiary structure formation. These ions can become strongly correlated in the close vicinity of the RNA surface. Most of the currently available software packages, which have had widespread success in predicting ion effects in biomolecular systems, do not explicitly account for the ion correlation effect. Therefore, it is important to develop a software package/web server for the prediction of ion electrostatics in RNA folding that includes ion correlation effects.

Results

The TBI web server http://rna.physics.missouri.edu/tbi_index.html provides predictions for the total electrostatic free energy, the different free energy components, and the mean number and the most probable distributions of the bound ions. A novel feature of the TBI server is its ability to account for ion correlation and ion distribution fluctuation effects.

Conclusions

By accounting for ion correlation and fluctuation effects, the TBI server is a unique online tool for computing ion-mediated electrostatic properties for given RNA structures. The results can provide important data for in-depth analysis of ion effects in RNA folding, including the ion dependence of folding stability, ion uptake in the folding process, and the interplay between the different energetic components.

20.

Background

The segment overlap score (SOV) has been used to evaluate predicted protein secondary structures, a sequence composed of helix (H), strand (E), and coil (C) states, by comparing it with the native or reference secondary structure, another sequence of H, E, and C. SOV's advantage is that it can consider the size of continuous overlapping segments and assign extra allowance to longer continuous overlapping segments, instead of judging only from the percentage of overlapping individual positions as the Q3 score does. However, we found a drawback in its previous definition: it does not guarantee that the assigned allowance increases when more residues in a segment are predicted accurately.
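For contrast, the position-wise Q3 score mentioned above is simply the fraction of residues whose predicted state matches the reference; a minimal sketch is shown below. SOV additionally weights each pair of overlapping reference/predicted segments by the extent of overlap plus an allowance term, which is the part refined in this work.

```python
def q3(reference, predicted):
    """Fraction of positions where the predicted secondary-structure state matches the reference."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(r == p for r, p in zip(reference, predicted))
    return matches / len(reference)

# Position-wise agreement only; segment sizes play no role here, unlike in SOV.
print(q3("CCHHHHCCEEEECC", "CCHHHCCCEEECCC"))   # -> 0.857...
```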

Results

A new way of assigning allowance has been designed, which keeps all the advantages of the previous SOV score definitions and ensures that the amount of allowance assigned increases when more elements in a segment are predicted accurately. Furthermore, our improved SOV achieves a higher correlation with the quality of protein models measured by GDT-TS score and TM-score, indicating a better ability to evaluate tertiary structure quality at the secondary structure level. We analyzed the statistical significance of SOV scores and found threshold values for distinguishing two protein structures (SOV_refine > 0.19) and for indicating whether two proteins are under the same CATH fold (SOV_refine > 0.94 and > 0.90 for three- and eight-state secondary structures, respectively). We provide two further example applications: using the score as a machine learning feature for protein model quality assessment, and comparing different definitions of topologically associating domains. We show that our newly defined SOV score results in better performance.

Conclusions

The SOV score can be widely used in bioinformatics research and in other fields that need to compare two sequences of letters in which continuous segments have important meanings. We also generalized the previous SOV definitions so that they work for sequences composed of more than three states (e.g., the eight-state definition of protein secondary structures). A standalone software package has been implemented in Perl, with source code released. The software can be downloaded from http://dna.cs.miami.edu/SOV/.
