Comparison of multiple protein structures has a broad range of applications in the analysis of protein structure, function and evolution. Multiple structure alignment tools (MSTAs) are necessary to obtain a simultaneous comparison of a family of related folds. In this study, we have developed a method for multiple structure comparison largely based on sequence alignment techniques. A widely used structural alphabet named Protein Blocks (PBs) was used to transform the information on 3D protein backbone conformation into a 1D sequence string. A progressive alignment strategy similar to CLUSTALW was adopted for multiple PB sequence alignment (mulPBA). Highly similar stretches identified by the pairwise alignments are given higher weights during the alignment. The residue equivalences from PB-based alignments are used to obtain a three-dimensional fit of the structures, followed by an iterative refinement of the structural superposition. Systematic comparisons using benchmark datasets of MSTAs underline that the alignment quality is better than that of MULTIPROT, MUSTANG and the alignments in HOMSTRAD in more than 85% of the cases. Comparison with other rigid-body and flexible MSTAs also indicates that mulPBA alignments are superior to those of most rigid-body MSTAs and highly comparable to those of the flexible alignment methods.
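The pairwise step that underlies a progressive alignment of PB strings can be illustrated with a minimal sketch. It assumes that each structure has already been encoded as a string over the 16 PB letters (a to p) and uses a plain Needleman-Wunsch global alignment with identity scoring. The function name and the `match`, `mismatch` and `gap` values are illustrative assumptions; this is not the mulPBA scoring scheme, which would rely on a PB substitution matrix and weighted similar stretches.

```python
def needleman_wunsch(a, b, match=2, mismatch=-1, gap=-2):
    """Globally align two PB strings; returns (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions, not taken from mulPBA.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Two hypothetical PB strings (a helical stretch followed by a turn).
    s, x, y = needleman_wunsch("mmmmnop", "mmmnop")
    print(s)
    print(x)
    print(y)
```

In a progressive scheme, pairwise scores of this kind would typically be used to build a guide tree, after which sequences or profiles are merged in tree order; the aligned columns then supply the residue equivalences for the 3D fit.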
Recent studies have demonstrated that the cell cycle plays a central role in development and carcinogenesis. Thus, the use of big databases and genome-wide high-throughput data to unravel the genetic and epigenetic mechanisms underlying cell cycle progression in stem cells and cancer cells is a matter of considerable interest.
Real genetic-and-epigenetic cell cycle networks (GECNs) of embryonic stem cells (ESCs) and HeLa cancer cells were constructed by applying system modeling, system identification, and big database mining to genome-wide next-generation sequencing data. Real GECNs were then reduced to core GECNs of HeLa cells and ESCs by applying principal genome-wide network projection. In this study, we investigated potential carcinogenic and stemness mechanisms for systems cancer drug design by identifying common core and specific GECNs between HeLa cells and ESCs. Integrating drug database information with the specific GECNs of HeLa cells could lead to identification of multiple drugs for cervical cancer treatment with minimal side-effects on the genes in the common core. We found that dysregulation of miR-29C, miR-34A, miR-98, and miR-215, together with methylation of ANKRD1, ARID5B, CDCA2, PIF1, STAMBPL1, TROAP, ZNF165, and HIST1H2AJ, in HeLa cells could result in cell proliferation and anti-apoptosis through the NFκB, TGF-β, and PI3K pathways. We also identified three drugs (methotrexate, quercetin, and mimosine) that repressed the activated cell cycle genes ARID5B, STK17B, and CCL2 in HeLa cells with minimal side-effects.
Peptide mass fingerprinting (PMF) has over the years become one of the most commonly used tools for high-throughput analysis and identification of proteins. This method is applicable when relatively simple samples have to be analysed, and it is commonly used for analysing proteins previously separated by 2-DE. The most common type of instrument used for this approach is the MALDI-TOF, which has proved to be particularly suitable for PMF analysis because of its speed, robustness, sensitivity and automation. We have used a MALDI-TOF equipped with a novel parallel PSD capability (MALDI micro MX) to perform the analysis of two sets of different biological samples isolated by 2-DE. By using a method that integrates the data obtained by PMF analysis with the PSD data obtained in the same experiment, we show that the new multiplexed PSD solution increases the protein identification rate compared to the normal PMF approach. We also investigated the use of a charge-directed fragmentation modification reagent to improve the identification rate and confidence levels.
The status of wetland inventory and classification is considered for 44 European countries, as well as for the continent as a whole. Data and information were obtained from questionnaires compiled by the International Waterfowl and Wetland Research Bureau, the MedWet sub-project on inventory and monitoring, and the Ramsar Bureau. Nine European countries have national wetland inventories, and 32 have inventories of sites of international importance listed under the Ramsar Convention. There has been a trend toward producing regional or continental inventories for wetlands that are important as waterfowl habitat. There is an urgent need to produce wetland inventories for all European countries. The Ramsar database takes into consideration hydrological and economic wetland values, as well as ecological ones. The Ramsar classification lists a total of 35 wetland types, and is sufficiently flexible that it could be used for classifying European wetlands at the national scale.
Current proteomics experiments can generate vast quantities of data very quickly, but this has not been matched by data analysis capabilities. Although there have been a number of recent reviews covering various aspects of peptide and protein identification methods using MS, comparisons of which methods are either the most appropriate for, or the most effective at, their proposed tasks are not readily available. As the need for high-throughput, automated peptide and protein identification systems increases, the creators of such pipelines need to be able to choose algorithms that are going to perform well both in terms of accuracy and computational efficiency. This article therefore provides a review of the currently available core algorithms for PMF, database searching using MS/MS, sequence tag searches and de novo sequencing. We also assess the relative performances of a number of these algorithms. As there is limited reporting of such information in the literature, we conclude that there is a need for the adoption of a system of standardised reporting on the performance of new peptide and protein identification algorithms, based upon freely available datasets. We go on to present our initial suggestions for the format and content of these datasets.
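As a rough illustration of what a PMF core algorithm does, the sketch below digests candidate protein sequences in silico with trypsin, computes singly protonated monoisotopic peptide masses, and ranks candidates by how many observed peaks they explain. The function names, the 0.2 Da tolerance, the count-based score and the toy database are all assumptions made for illustration; none of the reviewed algorithms is reproduced here, and real tools additionally handle missed cleavages, modifications and probabilistic scoring.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # charge carrier for the [M+H]+ ion


def tryptic_digest(seq):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def mh_mass(peptide):
    """Monoisotopic [M+H]+ mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed, protein, tol=0.2):
    """Number of observed peaks within `tol` Da of a theoretical peptide mass."""
    theoretical = [mh_mass(p) for p in tryptic_digest(protein)]
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def rank_candidates(observed, database, tol=0.2):
    """Sort (name, matched_peaks) pairs, best-matching candidate first."""
    scores = [(name, count_matches(observed, seq, tol))
              for name, seq in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical two-entry database and a peak list simulated from entry P1.
    db = {"P1": "AKRPGKLLER", "P2": "GGSGK"}
    peaks = [mh_mass(p) for p in tryptic_digest(db["P1"])]
    print(rank_candidates(peaks, db))
```

A simple matched-peak count of this kind favours long proteins, which is one reason why comparisons on shared, freely available datasets matter when judging scoring schemes.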
In recent years, proteomics has become increasingly important to functional genomics. Although a large amount of data is generated by high‐throughput large‐scale techniques, the ability to connect these mostly heterogeneous data from different analytical platforms and different experiments is limited. Data mining procedures and algorithms are often insufficient to extract meaningful results from large datasets and therefore limit the exploitation of the generated biological information. In our proteomic core facility, which focuses almost exclusively on 2‐DE/MS‐based proteomics, we developed a proteomic database custom‐tailored to our needs, aimed at connecting MS protein identification information to 2‐DE‐derived protein expression profiles. The tools developed should not only enable automatic evaluation of single experiments, but also link multiple 2‐DE experiments with MS data on different levels, thereby helping to create a comprehensive network of our proteomics data. The key feature of our “PROTEOMER” database is therefore its high cross‐referencing capacity, enabling integration of a wide range of experimental data. To illustrate the workflow and utility of the system, two practical examples are provided to demonstrate that proper data cross‐referencing can transform information into biological knowledge.
Database searches can fail to detect all truly homologous sequences, particularly when dealing with short, highly sequence-diverse protein families. Here, using microtubule interacting and transport (MIT) domains as an example, we have applied an approach of profile-profile matching followed by ab initio structure modelling to the detection of true homologues in the borderline-significant zone of database searches. Novel MIT domains were confidently identified in USP54, which contains an apparently inactive ubiquitin carboxyl-terminal hydrolase domain; in the katanin-like ATPase KATNAL1; and in an uncharacterized protein containing a VPS9 domain. As a proof of principle, we have confirmed the novel MIT annotation for USP54 by in vitro profiling of binding to CHMP proteins.
Fungi belong to the large kingdom of lower eukaryotic organisms encompassing yeasts along with filamentous and dimorphic members. Microbial P450 enzymes have contributed to exploration of and adaptation to diverse ecological niches, for example through conversion of lipophilic compounds to more hydrophilic derivatives or degradation of a vast array of environmental toxicants. To better understand diversification of the catalytic behavior of fungal P450s, detailed insight into the molecular machinery steering oxidative attack on the distinctly structured endogenous and xenobiotic substrates is of preeminent interest. Based on a general, CYP102A1-related template, the bulk of predicted substrate/inhibitor-binding determinants were shown to cluster near the distal heme face within the six known substrate recognition sites (SRSs), made up of the α-helical B′/F/G/I tetrad, the B′–C interhelical loop and strands of the β6-sheet. Population density was highest in the structurally flexible SRS-1 and SRS-4 domains, which show a low degree of conservation. Reactivity toward ligands favorably coincides with the lipophilicity/hydrophilicity profile and bulkiness of critical amino acids acting as selective filters. Some decisive elements may also serve in maintenance of catalytic competence by acting as gatekeepers that direct substrate access and positioning, or as stabilizers of the heme environment that enable dioxygen activation. Non-SRS residues seem to control spin state equilibria and attract redox partners by electrostatic forces. Of note, the inhibitory potency of azole-type fungicides is likely to arise from perturbation of the complex interplay of the mechanistic principles addressed above. Knowledge-supported exploitation of the topological data will be helpful in the manufacture of commodity/specialty chemicals as well as therapeutic agents. Also, engineered fungal P450s may be used to improve pollutant-specific bioremediation of contaminated soils.