Similar articles
Found 20 similar articles (search time: 0 ms)
1.
An algorithm is presented that returns the optimal pairwise gapped alignment of two sets of signed numerical sequence values. One distinguishing feature of this algorithm is a flexible comparison engine (based on both relative shape and absolute similarity measures) that does not rely on explicit gap penalties. Additionally, an empirical probability model is developed to estimate the significance of the returned alignment with respect to randomized data. The algorithm's utility for biological hypothesis formulation is demonstrated with test cases including database search and pairwise alignment of protein hydropathy. The algorithm and probability model could also be extended to accommodate other diverse types of protein or nucleic acid data, including positional thermodynamic stability and mRNA translation efficiency. The algorithm requires only numerical values as input and will readily compare data other than protein hydropathy. The tool is therefore expected to complement, rather than replace, existing sequence- and structure-based tools and may inform medical discovery, as exemplified by a proposed similarity between a chlamydial ORFan protein and a bacterial colicin pore-forming domain. The source code, documentation, and a basic web-server application are available.

2.
The solid-phase synthesis (SPS) concept, first developed for biopolymers, has spread to every field in which organic synthesis is involved. While the potential of the solid-phase method was obvious in 1959 to its discoverer, Prof. R. B. Merrifield, its dominance in peptide synthesis, and especially in combinatorial chemistry, an area not yet conceived, could not have been predicted. SPS paved the way for solid-phase combinatorial approaches (extensively reviewed in Choong, I. C. and Ellman, J. A.: 1996, Annu. Rep. Med. Chem. 31, 309–318; Obrecht, D. and Villalgordo, J. M.: 1998, Solid-supported Combinatorial and Parallel Synthesis of Small-Molecular-Weight Compound Libraries. Pergamon Press Ltd., Oxford, UK; Chabala, J. C.: 1995, Curr. Opin. Biotechnol. 6, 632–639; Kamal, A., Reddy, K. L., Devaiah, V., Shankaraiah, N., Reddy, D. R.: 2006, Mini Rev. Med. Chem. 6, 53–69; Whitehead, D. M., McKeown, S. C., Routledge, A.: 2005, Comb. Chem. HTS 8, 361–371; Nefzi, A., Ostresh, J. M., Houghten, R. A.: 1997, Chem. Rev. 97, 449–472; Gordon, E. M., Gallop, M. A., Patel, D. V.: 1996, Acc. Chem. Res. 29, 144–154) as many laboratories and companies focused on developing technologies and chemistry suited to this new methodology. This resulted in the spectacular growth of combinatorial chemistry, which profoundly changed the approach to new drug discovery. Combinatorial chemistry is currently considered a valid approach for a wide range of biomedical applications, such as target validation and drug discovery.

3.
4.
Protein-protein interactions (PPIs) are the basis of biological functions. Knowledge of the interactions of a protein can help understand its molecular function and its association with different biological processes and pathways. Several publicly available databases provide comprehensive information about individual proteins, such as their sequence, structure, and function. There also exist databases that are built exclusively to provide PPIs by curating them from published literature. The information provided in these web resources is protein-centric, and not PPI-centric. The PPIs are typically provided as lists of interactions of a given gene with links to interacting partners; they do not present a comprehensive view of the nature of both the proteins involved in the interactions. A web database that allows search and retrieval based on biomedical characteristics of PPIs is lacking, and is needed. We present Wiki-Pi (read Wiki-π), a web-based interface to a database of human PPIs, which allows users to retrieve interactions by their biomedical attributes such as their association to diseases, pathways, drugs and biological functions. Each retrieved PPI is shown with annotations of both of the participant proteins side-by-side, creating a basis to hypothesize the biological function facilitated by the interaction. Conceptually, it is a search engine for PPIs analogous to PubMed for scientific literature. Its usefulness in generating novel scientific hypotheses is demonstrated through the study of IGSF21, a little-known gene that was recently identified to be associated with diabetic retinopathy. Using Wiki-Pi, we infer that its association to diabetic retinopathy may be mediated through its interactions with the genes HSPB1, KRAS, TMSB4X and DGKD, and that it may be involved in cellular response to external stimuli, cytoskeletal organization and regulation of molecular activity. 
The website also provides a wiki-like capability allowing users to describe or discuss an interaction. Wiki-Pi is available publicly and freely at http://severus.dbmi.pitt.edu/wiki-pi/.

5.
The Protein Data Bank (PDB) is the worldwide repository of 3D structures of proteins, nucleic acids and complex assemblies. The PDB's large corpus of data (> 100,000 structures) and related citations provide a well-organized and extensive test set for developing and understanding data citation and access metrics. In this paper, we present a systematic investigation of how authors cite PDB as a data repository. We describe a novel metric, based on an information cascade constructed by exploring the citation network, that measures influence between competing works, and we apply it to analyze different data citation practices for PDB. Based on this new metric, we found that the original publication of RCSB PDB in the year 2000 continues to attract the most citations, even though many follow-up updates were published. None of these follow-up publications by members of the wwPDB organization can compete with the original publication in terms of citations and influence. Meanwhile, authors increasingly choose to use URLs of PDB in the text instead of citing PDB papers, which disrupts the growth of the literature citations. A comparison of data usage statistics and paper citations shows that PDB Web access is highly correlated with URL mentions in the text. The results reveal the trend in how authors cite a biomedical data repository and may provide useful insight into how to measure the impact of a data repository.

6.

Background & Objective

Managing data from large-scale projects (such as The Cancer Genome Atlas (TCGA)) for further analysis is an important and time-consuming step for research projects. Several efforts, such as the Firehose project, make TCGA pre-processed data publicly available via web services and data portals, but this information must be managed, downloaded and prepared for subsequent steps. We have developed an open-source and extensible R-based data client for pre-processed data from Firehose, and demonstrate its use with sample case studies. Results show that our RTCGAToolbox can facilitate data management for researchers interested in working with TCGA data. The RTCGAToolbox can also be integrated with other analysis pipelines for further data processing.

Availability and implementation

The RTCGAToolbox is open-source and licensed under the GNU General Public License Version 2.0. All documentation and source code for RTCGAToolbox are freely available at http://mksamur.github.io/RTCGAToolbox/ for Linux and Mac OS X operating systems.

7.
8.

Background

Graphical representation of data is one of the most easily comprehended forms of explanation. The current study describes a simple visualization tool which may allow greater understanding of medical and epidemiological data.

Method

We propose a simple tool for visualization of data, known as a “quilt plot”, that provides an alternative to presenting large volumes of data as frequency tables. Data from the Australian Needle and Syringe Program survey are used to illustrate “quilt plots”.

Conclusion

Visualization of large volumes of data using “quilt plots” enhances interpretation of medical and epidemiological data. Such intuitive presentations are particularly useful for the rapid assessment of problems in the data which cannot be readily identified by manual review. We recommend that, where possible, “quilt plots” be used along with traditional quantitative assessments of the data as an explanatory data analysis tool.

9.
In this issue of The Journal, an article by Schalkwyk et al. [1] shows the landscape of allele-specific DNA methylation (ASM) in the human genome. ASM has long been studied as a hallmark of imprinted genes, and a chromosome-wide version of this phenomenon occurs, in a random fashion, during X chromosome inactivation in female cells. But the type of ASM motivating the study by Schalkwyk et al. is different. They used a high-resolution, methylation-sensitive SNP array (MSNP) method for genome-wide profiling of ASM in total peripheral-blood leukocytes (PBL) and buccal cells from a series of monozygotic twin pairs. Their data bring a new level of detail to our knowledge of a newly recognized phenomenon: nonimprinted, sequence-dependent ASM. They document the widespread occurrence of this phenomenon among human genes and discuss its basic implications for gene regulation and genetic-epigenetic interactions. But this paper and recent work from other laboratories [2,3] raise the possibility of a more immediate and practical application for ASM mapping, namely to help extract maximum information from genome-wide association studies.

10.
Activation of CD4+ T cells requires the recognition of peptides that are presented by HLA class II molecules and can be assessed experimentally using the ELISpot assay. However, even given an individual’s HLA class II genotype, identifying which class II molecule is responsible for a positive ELISpot response to a given peptide is not trivial. The two main difficulties are the number of HLA class II molecules that can potentially be formed in a single individual (3–14) and the lack of clear peptide binding motifs for class II molecules. Here, we present a Bayesian framework to interpret ELISpot data (BIITE: Bayesian Immunogenicity Inference Tool for ELISpot); specifically BIITE identifies which HLA-II:peptide combination(s) are immunogenic based on cohort ELISpot data. We apply BIITE to two ELISpot datasets and explore the expected performance using simulations. We show this method can reach high accuracies, depending on the cohort size and the success rate of the ELISpot assay within the cohort.

11.
Viruses within a family often vary in their cellular tropism and pathogenicity. In many cases, these variations are due to viruses switching their specificity from one cell surface receptor to another. The structural requirements that underlie such receptor switching are not well understood, especially for carbohydrate-binding viruses, because methods capable of structure-specificity studies have only recently been developed for carbohydrates. We have characterized the receptor specificity, structure and infectivity of the human polyomavirus BKPyV, the causative agent of polyomavirus-associated nephropathy, and uncovered a molecular switch for binding different carbohydrate receptors. We show that the b-series gangliosides GD3, GD2, GD1b and GT1b can all serve as receptors for BKPyV. The crystal structure of the BKPyV capsid protein VP1 in complex with GD3 reveals contacts with two sialic acid moieties in the receptor, providing a basis for the observed specificity. Comparison with the structure of simian virus 40 (SV40) VP1 bound to ganglioside GM1 identifies the amino acid at position 68 as a determinant of specificity. Mutation of this residue from lysine in BKPyV to serine in SV40 switches the receptor specificity of BKPyV from GD3 to GM1 both in vitro and in cell culture. Our findings highlight the plasticity of viral receptor binding sites and form a template to retarget viruses to different receptors and cell types.

12.
Tunnels and channels facilitate the transport of small molecules, ions and water solvent in a large variety of proteins. Characteristics of individual transport pathways, including their geometry, physico-chemical properties and dynamics are instrumental for understanding of structure-function relationships of these proteins, for the design of new inhibitors and construction of improved biocatalysts. CAVER is a software tool widely used for the identification and characterization of transport pathways in static macromolecular structures. Herein we present a new version of CAVER enabling automatic analysis of tunnels and channels in large ensembles of protein conformations. CAVER 3.0 implements new algorithms for the calculation and clustering of pathways. A trajectory from a molecular dynamics simulation serves as the typical input, while detailed characteristics and summary statistics of the time evolution of individual pathways are provided in the outputs. To illustrate the capabilities of CAVER 3.0, the tool was applied for the analysis of molecular dynamics simulation of the microbial enzyme haloalkane dehalogenase DhaA. CAVER 3.0 safely identified and reliably estimated the importance of all previously published DhaA tunnels, including the tunnels closed in DhaA crystal structures. Obtained results clearly demonstrate that analysis of molecular dynamics simulation is essential for the estimation of pathway characteristics and elucidation of the structural basis of the tunnel gating. CAVER 3.0 paves the way for the study of important biochemical phenomena in the area of molecular transport, molecular recognition and enzymatic catalysis. The software is freely available as a multiplatform command-line application at http://www.caver.cz.
This is a PLOS Computational Biology Software Article

13.
14.
15.
16.

Background

High-throughput RNA interference (RNAi) screening has become a widely used approach to elucidating gene functions. However, analysis and annotation of large data sets generated from these screens has been a challenge for researchers without a programming background. Over the years, numerous data analysis methods were produced for plate quality control and hit selection and implemented by a few open-access software packages. Recently, strictly standardized mean difference (SSMD) has become a widely used method for RNAi screening analysis mainly due to its better control of false negative and false positive rates and its ability to quantify RNAi effects with a statistical basis. We have developed GUItars to enable researchers without a programming background to use SSMD as both a plate quality and a hit selection metric to analyze large data sets.

Results

The software is accompanied by an intuitive graphical user interface for easy and rapid analysis workflow. SSMD analysis methods have been provided to the users along with traditionally-used z-score, normalized percent activity, and t-test methods for hit selection. GUItars is capable of analyzing large-scale data sets from screens with or without replicates. The software is designed to automatically generate and save numerous graphical outputs known to be among the most informative high-throughput data visualization tools capturing plate-wise and screen-wise performances. Graphical outputs are also written in HTML format for easy access, and a comprehensive summary of screening results is written into tab-delimited output files.

Conclusion

With GUItars, we demonstrated a robust SSMD-based analysis workflow on a 3840-gene small interfering RNA (siRNA) library and identified 200 siRNAs that increased and 150 siRNAs that decreased the assay activities with moderate or stronger effects. GUItars enables rapid analysis and illustration of data from large- or small-scale RNAi screens using SSMD and other traditional analysis methods. The software is freely available at http://sourceforge.net/projects/guitars/.
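The SSMD and z-score metrics named in this abstract can be illustrated with a minimal Python sketch. It assumes the commonly used method-of-moments form of SSMD for two independent groups (difference of means divided by the square root of the summed variances); the abstract does not state which estimator GUItars applies, and the function names `ssmd` and `z_scores` are hypothetical, not part of the GUItars software.

```python
from math import sqrt
from statistics import mean, variance


def ssmd(sample, reference):
    """Method-of-moments SSMD for two independent groups (an assumed form).

    Difference of the group means divided by the square root of the sum of
    the two sample variances. Each group needs at least two values.
    """
    return (mean(sample) - mean(reference)) / sqrt(
        variance(sample) + variance(reference)
    )


def z_scores(values):
    """Plain z-scores: each value centred on the mean and scaled by the sample SD."""
    mu = mean(values)
    sd = sqrt(variance(values))
    return [(v - mu) / sd for v in values]


if __name__ == "__main__":
    # Made-up illustrative readings, not data from the screen in the abstract.
    sirna_wells = [1.0, 2.0, 3.0]
    negative_controls = [4.0, 5.0, 6.0]
    print(ssmd(sirna_wells, negative_controls))
    print(z_scores(sirna_wells))
```

A negative SSMD here means the sample group reads lower than the reference group; the thresholds that separate "moderate" from "strong" effects are a choice of the analyst and are not given in the abstract.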

17.
Steric-block ON analogues are efficient inhibitors of RNA-protein interaction and therefore have potential to probe RNA sequences for putative protein binding sites and to investigate mechanisms of protein binding. The packaging process of HIV-1 is highly specific, involving an interaction between the Gag protein and a conserved sequence that is only present on genomic viral RNA. Using oligonucleotide probes we have confirmed that the terminal purine loop is the major Gag binding site on SL3 and that a secondary Gag binding site exists at an internal purine bulge. We also demonstrate direct binding of the oligonucleotides to their binding sites and confirm that this interaction does not alter global RNA conformation, making them highly specific, nondisruptive probes of RNA-protein interactions.

18.
Trinucleotide repeats are common within gene coding regions and could serve as beacons to locate genes. Five of the most common trinucleotide repeats in an Actinidia (kiwifruit) expressed sequence tag (EST) database were found to be (ACC)4, (CAC)4, (CCA)4, (CTC)4, and (TGG)4. These repeats, with or without an artificial 5′-end tail, were tested by vectorette PCR against genomic DNA from Actinidia chinensis. Eighty-nine randomly selected clones showed an average insert size of 383 bp, with a maximum of 1,151 bp and a minimum of 78 bp. Two-thirds of the clones contained the artificial tail attached to the trinucleotide, showing a slight advantage of possessing such a tail during annealing and amplification. The sequences were searched against the Actinidia EST database and GenBank. Of the 89 clones, 33 had a significant hit (expect value < e−15). Twenty-four of those clones matched an Actinidia EST. Twenty-one clones contained one or more simple sequence repeats. This methodology can be applied by conventional cloning and sequencing methods or by high throughput pyrosequencing technologies to develop genetic markers and also for gene mining in species with little or no genetic/genomic resources.
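As an illustration of how the five repeat motifs listed in this abstract could be located in a sequence, here is a minimal Python sketch using regular expressions. It is not the authors' pipeline: the function name `find_trinucleotide_repeats`, the minimum-copy threshold as a parameter, and the example sequence are all assumptions made for illustration.

```python
import re

# The five repeat units named in the abstract; four copies is the
# length reported there, e.g. (ACC)4.
REPEAT_UNITS = ["ACC", "CAC", "CCA", "CTC", "TGG"]
MIN_COPIES = 4


def find_trinucleotide_repeats(sequence, units=REPEAT_UNITS, min_copies=MIN_COPIES):
    """Return (unit, start, copies) for each run of at least `min_copies`
    tandem copies of a unit, sorted by 0-based start position."""
    sequence = sequence.upper()
    hits = []
    for unit in units:
        # Non-capturing group repeated min_copies or more times, matched greedily.
        pattern = re.compile("(?:%s){%d,}" % (unit, min_copies))
        for match in pattern.finditer(sequence):
            hits.append((unit, match.start(), len(match.group()) // 3))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Made-up sequence: (ACC)4 at position 2 and (TGG)5 at position 16.
    example = "GG" + "ACC" * 4 + "TT" + "TGG" * 5 + "A"
    for unit, start, copies in find_trinucleotide_repeats(example):
        print(unit, start, copies)
```

Because ACC, CAC and CCA are frame shifts of one another, a long ACC run also contains shorter CAC and CCA runs; the threshold decides whether those shifted frames are reported as separate hits.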

19.
Förster resonance energy transfer (FRET) is an exquisitely sensitive method for detection of molecular interactions and conformational changes in living cells. The recent advent of fluorescence imaging technology with single-molecule (or molecular-complex) sensitivity, together with refinements in the kinetic theory of FRET, provides the necessary tool kit for determining the stoichiometry and relative disposition of the protomers within protein complexes (i.e., quaternary structure) of membrane receptors and transporters in living cells. In contrast to standard average-based methods, this method relies on the analysis of distributions of apparent FRET efficiencies, Eapp, across the image pixels of individual cells expressing proteins of interest. The most probable quaternary structure of the complex is identified from the number of peaks in the Eapp distribution and their dependence on a single parameter, termed pairwise FRET efficiency. Such peaks collectively create a unique FRET spectrum corresponding to each oligomeric configuration of the protein. Therefore, FRET could quite literally become a spectrometric method, akin to mass spectrometry, for sorting protein complexes according to their size and shape.

20.
Mutations in the NPHS2 gene are a major cause of steroid-resistant nephrotic syndrome, a severe human kidney disorder. The NPHS2 gene product podocin is a key component of the slit diaphragm cell junction at the kidney filtration barrier and part of a multiprotein-lipid supercomplex. A similar complex with the podocin ortholog MEC-2 is required for touch sensation in Caenorhabditis elegans. Although podocin and MEC-2 are membrane-associated proteins with a predicted hairpin-like structure and amino and carboxyl termini facing the cytoplasm, this membrane topology has not been convincingly confirmed. One particular mutation that causes kidney disease in humans (podocin P118L) has also been identified in C. elegans in genetic screens for touch insensitivity (MEC-2 P134S). Here we show that both mutant proteins, in contrast to the wild-type variants, are N-glycosylated because the mutant C termini project extracellularly. Podocin P118L and MEC-2 P134S did not fractionate in detergent-resistant membrane domains. Moreover, mutant podocin failed to activate the ion channel TRPC6, which is part of the multiprotein-lipid supercomplex, indicating that cholesterol recruitment to the ion channels, an intrinsic function of both proteins, requires C termini facing the cytoplasmic leaflet of the plasma membrane. Taken together, this study demonstrates that the carboxyl terminus of podocin/MEC-2 has to be placed at the inner leaflet of the plasma membrane to mediate cholesterol binding and contribute to ion channel activity, a prerequisite for mechanosensation and the integrity of the kidney filtration barrier.
