Similar Documents
20 similar documents found; search took 31 ms.
1.
Guo Y  Li M  Lu M  Wen Z  Huang Z 《Proteins》2006,65(1):55-60
Determining the coupling specificity of G-protein coupled receptors (GPCRs) is very important for further understanding receptor function. A successful method in this area will benefit both basic research and drug discovery. Previously published methods rely on transmembrane topology prediction at the training step, and sometimes even at the prediction step. However, the transmembrane topology predicted by even the best algorithms is not highly accurate. In this study, we developed a new method, an autocross-covariance (ACC) transform based support vector machine (SVM), to predict coupling specificity between GPCRs and G-proteins. The primary amino acid sequences are translated into vectors based on the principal physicochemical properties of the amino acids, and the data are transformed into a uniform matrix by applying the ACC transform. SVMs for nonpromiscuously and promiscuously coupled GPCRs were trained and validated by the jackknife test, and the results thus obtained are very promising. All classifiers were also evaluated on test datasets with good performance. Besides the high prediction accuracy, the most important feature of this method is that it requires no transmembrane topology prediction at either the training or the prediction step, only the primary protein sequences. The results indicate that this relatively simple method is applicable. Academic users can freely download the prediction program at http://www.scucic.net/group/database/Service.asp.
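The appeal of the ACC transform is that it maps variable-length sequences to fixed-length feature vectors suitable for an SVM. A minimal sketch in Python, assuming a made-up two-descriptor property table (the real method uses principal physicochemical properties of all 20 amino acids):

```python
import numpy as np

# Hypothetical example: each residue is mapped to z-scale-like
# physicochemical descriptors (values here are made up for illustration).
PROPS = {
    "A": [0.07, -1.73], "C": [0.71, -0.97], "D": [3.64, 1.13],
    "G": [2.23, -5.36], "K": [2.84, 1.41], "L": [-4.19, -1.03],
}

def acc_transform(seq, max_lag=2):
    """Auto-cross covariance: for every pair of properties (j, k) and
    lag l, average the product of centered property values at positions
    i and i+l.  The output length is fixed regardless of sequence length."""
    p = np.array([PROPS[a] for a in seq], dtype=float)
    p -= p.mean(axis=0)                      # center each property column
    n, d = p.shape
    feats = []
    for lag in range(1, max_lag + 1):
        for j in range(d):
            for k in range(d):
                feats.append(np.dot(p[:n - lag, j], p[lag:, k]) / (n - lag))
    return np.array(feats)

v1 = acc_transform("ACDGKLAC")
v2 = acc_transform("ACDGKLACDGKL")
print(len(v1), len(v2))   # both 8 = d * d * max_lag, independent of length
```

Because the feature length depends only on the number of properties and the maximum lag, sequences of different lengths become directly comparable vectors.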

2.
Modifications of two qualitative microbiological methods for the detection of beta-lactamase-producing microbial strains are described. The methods are based on the principle of Gots. One of them, named the "contact" method, differs significantly from the prototype: it is simpler, more convenient and easily reproducible, and it requires neither special equipment nor special training of the staff.

3.
4.
Investigators planning to use animals in their research and the Institutional Animal Care and Use Committee (IACUC) members who review the research protocols must take personal responsibility for ensuring that they have the skills and knowledge to perform their duties, applying the Three Rs principles of Russell and Burch. The two Korean laws regulating animal use for scientific purposes in line with the Three Rs principles, introduced in 2008 and 2009, have been revised a total of 11 times over the last 6 years. Both regulatory agencies, i.e., the Animal and Plant Quarantine Agency and the Ministry of Food and Drug Safety, provide regular training based on the legal requirements. Under the amended Animal Welfare Act, the IACUC appointment framework has been upgraded: since 2012, appointments are for two-year terms and require a qualified training certificate issued by the Animal and Plant Quarantine Agency. The authors reviewed the current curricular programs and the types of training conducted by the two governing agencies through Internet searches. Our survey results suggest that: a) training curricula should be diversified based on the roles, backgrounds and needs of the individual trainees; b) proper and continued educational programs should be provided based on trainees' experiences; and c) active encouragement by government authorities can improve the quality of training curricula. [BMB Reports 2014; 47(4): 179-183]

5.
Successful prediction of peptide:MHC binding typically requires a large set of binding data for the specific MHC molecule examined. Structure-based prediction methods promise to circumvent this requirement by evaluating the physical contacts a peptide can make with an MHC molecule, based on the highly conserved 3D structure of peptide:MHC complexes. While several such methods have been described before, most are not publicly available and have not been independently tested for their performance. Here we implemented and evaluated three prediction methods for MHC class II molecules: statistical potentials derived from the analysis of known protein structures; energetic evaluation of different peptide snapshots in a molecular dynamics simulation; and direct analysis of contacts made in known 3D structures of peptide:MHC complexes. These methods are ab initio in that they require structural data for the MHC molecule examined, but no specific peptide:MHC binding data. Moreover, they retain the ability to make predictions on a time scale short enough to be useful in real-world applications, such as screening a whole proteome for candidate binding peptides. A rigorous evaluation of each method's prediction performance showed that all are significantly better than random, but still substantially below the best-performing sequence-based class II prediction methods available. While the approaches presented here were developed independently, we have chosen to present our results together to support the notion that generating structure-based predictions of peptide:MHC binding without using binding data is unlikely to give satisfactory results.

6.
The accuracy of base calls produced by Illumina sequencers is adversely affected by several processes, with laser cross-talk and cluster phasing being prominent. We introduce an explicit statistical model of the sequencing process that generalizes current models of phasing and cross-talk and forms the basis of a base calling method which improves on the best existing base callers, especially when comparing the number of error-free reads. The novel algorithms implemented in All Your Base (AYB) are comparable in speed to other competitive base-calling methods, do not require training data and are designed to be robust to gross errors, producing sensible results where other techniques struggle. AYB is available at http://www.ebi.ac.uk/goldman-srv/AYB/.
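The laser cross-talk component can be illustrated with a generic linear correction. This is a simplified stand-in, not AYB's actual statistical model, and the matrix values are made up: observed channel intensities are treated as a cross-talk matrix applied to the true per-base signals, so correction amounts to solving a linear system.

```python
import numpy as np

# Hypothetical cross-talk between the A, C, G, T detection channels:
# each row says how much of each true signal bleeds into that channel.
M = np.array([
    [1.00, 0.25, 0.05, 0.00],
    [0.20, 1.00, 0.00, 0.05],
    [0.05, 0.00, 1.00, 0.30],
    [0.00, 0.05, 0.25, 1.00],
])

true_signal = np.array([0.0, 10.0, 0.0, 0.0])   # a pure "C" cluster
observed = M @ true_signal                       # channels bleed into each other
corrected = np.linalg.solve(M, observed)         # invert the cross-talk

print(observed)    # signal smeared across several channels
print(corrected)   # recovers the pure per-base signal
```

Real base callers must also estimate the cross-talk matrix from the data and handle phasing and noise jointly, which is where the explicit statistical model in AYB comes in.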

7.
Machine learning methods, in particular convolutional neural networks, have been applied to a variety of problems in cryo-EM and macromolecular crystallographic structure solution. However, they still have only limited acceptance by the community, mainly in areas where they replace repetitive work and allow for easy visual checking, such as particle picking, crystal centering or crystal recognition. With Artificial Intelligence (AI) based protein fold prediction currently revolutionizing the field, it is clear that their scope could be much wider. However, whether we will be able to exploit this potential fully will depend on the manner in which we use machine learning: training data must be well-formulated, methods need to utilize appropriate architectures, and outputs must be critically assessed, which may even require explaining AI decisions.

8.
Analysis of cellular phenotypes in large imaging data sets conventionally involves supervised statistical methods, which require user-annotated training data. This paper introduces an unsupervised learning method, based on temporally constrained combinatorial clustering, for automatic prediction of cell morphology classes in time-resolved images. We applied the unsupervised method to diverse fluorescent markers and screening data and validated accurate classification of human cell phenotypes, demonstrating fully objective data labeling in image-based systems biology.

9.
Hwang H  Pierce B  Mintseris J  Janin J  Weng Z 《Proteins》2008,73(3):705-709
We present version 3.0 of our publicly available protein-protein docking benchmark. This update includes 40 new test cases, representing a 48% increase from Benchmark 2.0. For all of the new cases, the crystal structures of both binding partners are available. As with Benchmark 2.0, Structural Classification of Proteins (Murzin et al., J Mol Biol 1995;247:536-540) was used to remove redundant test cases. The 124 unbound-unbound test cases in Benchmark 3.0 are classified into 88 rigid-body cases, 19 medium-difficulty cases, and 17 difficult cases, based on the degree of conformational change at the interface upon complex formation. In addition to providing the community with more test cases for evaluating docking methods, the expansion of Benchmark 3.0 will facilitate the development of new algorithms that require a large number of training examples. Benchmark 3.0 is available to the public at http://zlab.bu.edu/benchmark.

10.
MOTIVATION: Modern strategies for mapping disease loci require efficient genotyping of a large number of known polymorphic sites in the genome. The sensitive and high-throughput nature of hybridization-based DNA microarray technology provides an ideal platform for such an application by interrogating up to hundreds of thousands of single nucleotide polymorphisms (SNPs) in a single assay. As with expression arrays, these genotyping arrays pose many data-analytic challenges that are often platform specific. Affymetrix SNP arrays, for example, use multiple sets of short oligonucleotide probes for each known SNP and require effective statistical methods to combine these probe intensities in order to generate reliable and accurate genotype calls. RESULTS: We developed an integrated multi-SNP, multi-array genotype calling algorithm for Affymetrix SNP arrays, MAMS, that combines single-array, multi-SNP (SAMS) and multi-array, single-SNP (MASS) calls to improve the accuracy of genotype calls, without the need for training data or computation-intensive normalization procedures as in other multi-array methods. The algorithm uses resampling techniques and model-based clustering to derive single-array genotype calls, which are subsequently refined by competitive genotype calls based on MASS clustering. The resampling scheme caps computation for single-array analysis and hence is readily scalable, which is important in view of expanding numbers of SNPs per array. The MASS update is designed to improve calls for atypical SNPs harboring allele-imbalanced binding affinities, which are difficult to genotype without information from other arrays. Using a publicly available data set of HapMap samples from Affymetrix, and independent calls by alternative genotyping methods from the HapMap project, we show that our approach performs competitively with existing methods. AVAILABILITY: R functions are available upon request from the authors.
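The clustering intuition behind genotype calling can be shown with a deliberately simplified stand-in: instead of MAMS's resampling and model-based clustering, a toy contrast statistic between A- and B-allele probe intensities is thresholded into the three genotype clusters. All values and thresholds below are hypothetical:

```python
import numpy as np

def call_genotypes(a, b):
    """Toy genotype caller: the contrast (a - b) / (a + b) is near +1
    for AA samples, near 0 for AB, and near -1 for BB.  Real callers
    fit clusters to these statistics instead of fixed thresholds."""
    contrast = (a - b) / (a + b)
    calls = np.where(contrast > 0.33, "AA",
             np.where(contrast < -0.33, "BB", "AB"))
    return list(calls)

# Hypothetical A- and B-allele summary intensities for four samples.
a = np.array([900.0, 480.0, 60.0, 870.0])
b = np.array([80.0, 510.0, 940.0, 95.0])
print(call_genotypes(a, b))
```

Allele-imbalanced SNPs shift the AB cluster away from zero, which is exactly why the multi-array MASS step in the abstract is needed to refine such calls.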

11.
MOTIVATION: A number of methods have been developed to predict functional specificity determinants in protein families based on sequence information. Most of these methods rely on pre-defined functional subgroups. Manual subgroup definition is difficult because of the limited number of experimentally characterized subfamilies with differing specificity, while automatic subgroup partitioning using computational tools is a non-trivial task and does not always yield ideal results. RESULTS: We propose a new approach SPEL (specificity positions by evolutionary likelihood) to detect positions that are likely to be functional specificity determinants. SPEL, which does not require subgroup definition, takes a multiple sequence alignment of a protein family as the only input, and assigns a P-value to every position in the alignment. Positions with low P-values are likely to be important for functional specificity. An evolutionary tree is reconstructed during the calculation, and P-value estimation is based on a random model that involves evolutionary simulations. Evolutionary log-likelihood is chosen as a measure of amino acid distribution at a position. To illustrate the performance of the method, we carried out a detailed analysis of two protein families (LacI/PurR and G protein alpha subunit), and compared our method with two existing methods (evolutionary trace and mutual information based). All three methods were also compared on a set of protein families with known ligand-bound structures. AVAILABILITY: SPEL is freely available for non-commercial use. Its pre-compiled versions for several platforms and alignments used in this work are available at ftp://iole.swmed.edu/pub/SPEL/
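For contrast, the mutual-information baseline that SPEL is compared against can be sketched directly: score each alignment column by the mutual information between its residues and the subgroup labels (note that SPEL itself deliberately avoids subgroup definitions). A toy illustration:

```python
from collections import Counter
import math

def column_mi(column, groups):
    """Mutual information (in bits) between the amino acid observed at
    an alignment column and the functional subgroup of each sequence.
    High MI suggests a specificity-determining position."""
    n = len(column)
    p_a = Counter(column)              # amino acid marginal counts
    p_g = Counter(groups)              # subgroup marginal counts
    p_ag = Counter(zip(column, groups))
    mi = 0.0
    for (a, g), c in p_ag.items():
        p = c / n
        mi += p * math.log2(p / ((p_a[a] / n) * (p_g[g] / n)))
    return mi

# Toy column perfectly correlated with subgroup vs. a fully conserved one.
groups = ["g1", "g1", "g1", "g2", "g2", "g2"]
print(column_mi("KKKEEE", groups))   # 1 bit: informative column
print(column_mi("AAAAAA", groups))   # 0 bits: conserved, uninformative
```

The weakness the abstract alludes to is visible here: MI is only as good as the subgroup labels, whereas SPEL replaces them with an evolutionary null model.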

12.
Public trust demands that individuals who do research, testing, or teaching with animals use humane, ethical, and scientifically sound methods. Furthermore, the Animal Welfare Act and the Public Health Service Policy require research institutions to provide basic training and to ensure that anyone who cares for and/or works with laboratory animals has the appropriate training or experience relevant to their job responsibilities. Institutions accredited by the Association for Assessment and Accreditation of Laboratory Animal Care International must also provide training programs and ensure the qualifications of personnel. The primary goal of this training is to provide individuals with basic knowledge and to reinforce attitudes and behaviors that help to ensure humane animal care and use. This article provides an overview of the core training module outline and content from the 1991 report of the Institute for Laboratory Animal Research, Education and Training in the Care and Use of Laboratory Animals: A Guide for Developing Institutional Programs, as well as pertinent updates for introducing personnel to information regarding the care and use of laboratory animals. Both mandatory and suggested training topics are reviewed, including relevant regulations and standards, ethical considerations, humane methods of animal experimentation and maintenance, and other pertinent topics. Although the fundamental training course content and delivery will vary depending on the nature and complexity of an institution's animal care and use program, this basic training provides the foundation for more in-depth training programs and supports humane and ethical animal care and use.

13.
14.
MOTIVATION: Gene expression data often contain missing expression values. Effective missing value estimation methods are needed, since many algorithms for gene expression data analysis require a complete matrix of gene array values. In this paper, imputation methods based on the least squares formulation are proposed to estimate missing values in gene expression data, exploiting local similarity structures in the data as well as a least squares optimization process. RESULTS: The proposed local least squares imputation method (LLSimpute) represents a target gene that has missing values as a linear combination of similar genes. The similar genes are chosen by k-nearest neighbors, i.e., the k coherent genes with the largest absolute Pearson correlation coefficients. A non-parametric missing value estimation version of LLSimpute is designed by introducing an automatic k-value estimator. In our experiments, the proposed LLSimpute method shows competitive results when compared with other imputation methods for missing value estimation on various datasets and percentages of missing values in the data. AVAILABILITY: The software is available at http://www.cs.umn.edu/~hskim/tools.html CONTACT: hpark@cs.umn.edu
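The core idea of LLSimpute, representing a target gene as a linear combination of its most correlated neighbors, can be sketched as follows. This uses a fixed k and hypothetical data; the published method additionally estimates k automatically:

```python
import numpy as np

def lls_impute(X, gene, col, k=2):
    """Predict the missing entry X[gene, col] from the k genes most
    correlated with the target gene over the observed columns, via a
    least squares fit of the target on those neighbours."""
    known = [j for j in range(X.shape[1]) if j != col]
    others = [g for g in range(X.shape[0]) if g != gene]
    # rank candidate genes by |Pearson correlation| over the known columns
    corr = [abs(np.corrcoef(X[gene, known], X[g, known])[0, 1]) for g in others]
    nbrs = [others[i] for i in np.argsort(corr)[::-1][:k]]
    A = X[nbrs][:, known].T          # known values of the neighbour genes
    b = X[gene, known]               # known values of the target gene
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(X[nbrs][:, col] @ w)

# Toy expression matrix; pretend X[0, 3] is missing (its true value is 4.0).
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 4.0, 6.0, 8.0],
              [1.1, 2.1, 3.1, 4.1],
              [9.0, 1.0, 7.0, 2.0]])
print(lls_impute(X, gene=0, col=3))
```

Because the fit only uses the locally most similar genes, the method exploits the local similarity structure the abstract describes rather than the whole matrix.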

15.
The sampling and analytical methods, along with available microorganisms, used for in situ hydrocarbon bioremediation are reviewed. Each treatment method is briefly described and its advantages and limitations pertaining to potential applications are evaluated. Bioremediation provides cost-effective, contaminant- and substrate-specific treatments equally successful in reducing the concentrations of single compounds or mixtures of biodegradable materials. In situ treatments rarely yield undesirable byproducts, but precautions and preliminary baseline tests are always recommended. Sampling methods should adhere to good laboratory and field practices and usually do not require highly trained personnel. Analytical methods vary in sensitivity, cost, duration of sample analysis and personnel training required. Voucher specimens of bacterial strains used in bioremediation exist in various repositories (e.g. ATCC, DSM, etc.) or are commercially available, and are usually covered by patent rights. Each one of these strains may yield spectacular results in vitro for specific target compounds. However, the overall success of such strains in treating a wide range of contaminants in situ remains limited. The reintroduction of indigenous microorganisms isolated from the contaminated site after culturing seems to be a highly effective bioremediation method, especially when microorganism growth is supplemented by oxygen and fertilizers. Received: 10 June 1997 / Received revision: 14 August 1997 / Accepted: 25 August 1997

16.
Metaproteomic studies of full-scale activated sludge systems require reproducible protein extraction methods. Three different extraction protocols, each in combination with three different methods of cell lysis, as well as a commercial kit, were systematically evaluated. Criteria used to compare the methods included the extracted protein concentration, the number of identified proteins and peptides, their phylogenetic, cell-localization and functional distributions, and quantitative reproducibility. Furthermore, the advantage of using specific metagenomes and a 2-step database approach is illustrated. Based on the results, a protocol for protein extraction from activated sludge using the protein extraction reagent B-Per and bead beating is recommended. The data have been deposited to ProteomeXchange with identifier PXD000862 (http://proteomecentral.proteomexchange.org/dataset/PXD000862).

17.
MOTIVATION: Remote homology detection is among the most intensively researched problems in bioinformatics. Currently, discriminative approaches, especially kernel-based methods, provide the most accurate results. However, kernel methods also show several drawbacks: in many cases prediction of new sequences is computationally expensive, kernels often lack an interpretable model for analysis of characteristic sequence features, and most approaches make use of so-called hyperparameters, which complicates the application of methods across different datasets. RESULTS: We introduce a feature vector representation for protein sequences based on distances between short oligomers. The corresponding feature space arises from distance histograms for any possible pair of K-mers. Our distance-based approach shows important advantages in terms of computational speed, while on common test data the prediction performance is highly competitive with state-of-the-art methods for protein remote homology detection. Furthermore, the learnt model can easily be analyzed in terms of discriminative features, and in contrast to other methods our representation does not require any tuning of kernel hyperparameters. AVAILABILITY: Normalized kernel matrices for the experimental setup can be downloaded at www.gobics.de/thomas. Matlab code for computing the kernel matrices is available upon request. CONTACT: thomas@gobics.de, peter@gobics.de.
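The distance-histogram representation can be sketched for a toy alphabet with K=1; the published method uses pairs of K-mers over the amino acid alphabet, and the function and parameter names here are illustrative:

```python
from collections import defaultdict

def distance_features(seq, k=1, max_dist=4):
    """For every ordered pair of K-mers (a, b), count how often b occurs
    at distance d downstream of a, for d = 1..max_dist.  The resulting
    sparse histogram is the feature vector for the sequence."""
    pos = defaultdict(list)
    for i in range(len(seq) - k + 1):
        pos[seq[i:i + k]].append(i)          # occurrence positions per K-mer
    feats = defaultdict(int)
    for a in pos:
        for b in pos:
            for i in pos[a]:
                for j in pos[b]:
                    d = j - i
                    if 0 < d <= max_dist:
                        feats[(a, b, d)] += 1
    return dict(feats)

f = distance_features("ABAB")
print(f[("A", "B", 1)], f[("A", "A", 2)])   # co-occurrence counts by distance
```

Because the features are explicit counts rather than an implicit kernel, the discriminative K-mer pairs of a trained linear model can be read off directly, which is the interpretability advantage the abstract highlights.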

18.
A protein is generally classified into one of the following four structural classes: all alpha, all beta, alpha+beta and alpha/beta. In this paper, based on weighting of the 20 constituent amino acids, a new method is proposed for predicting the structural class of a protein from its amino acid composition. The 20 weighting parameters, which reflect the different properties of the 20 constituent amino acids, were obtained from a training set of proteins through a linear-programming approach. The rate of correct prediction for a training set of proteins by the new method was 100%, whereas the highest rate of previous methods was 82.8%. Furthermore, the results showed that the more numerous the training proteins, the more effective the new method.
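The prediction side of a weighted-composition classifier can be sketched as follows. The weights and class centroids below are made up for illustration, and only two of the four classes are shown; the published method learns the 20 weights from training proteins by linear programming rather than using uniform weights and toy centroids:

```python
from collections import Counter

AA = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Fraction of each of the 20 amino acids in the sequence."""
    c = Counter(seq)
    n = len(seq)
    return [c[a] / n for a in AA]

# Hypothetical uniform weights and toy class centroids.
weights = {a: 1.0 for a in AA}
centroids = {
    "all-alpha": composition("AAAALLLLEEEEKKKK"),   # helix-favoring residues
    "all-beta":  composition("VVVVIIIITTTTSSSS"),   # strand-favoring residues
}

def predict(seq):
    """Assign the class whose centroid is nearest in weighted distance."""
    x = composition(seq)
    def dist(c):
        return sum(weights[a] * (xi - ci) ** 2
                   for a, xi, ci in zip(AA, x, c))
    return min(centroids, key=lambda name: dist(centroids[name]))

print(predict("AALLEEKKAALL"))   # closest to the helix-like centroid
```

The role of the learned weights is to stretch this distance so that amino acids which discriminate well between structural classes count more than those that do not.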

19.
Sequence-based residue contact prediction plays a crucial role in protein structure reconstruction. In recent years, the combination of evolutionary coupling analysis (ECA) and deep learning (DL) techniques has made tremendous progress in residue contact prediction, so a comprehensive assessment of current methods on a large-scale benchmark data set is much needed. In this study, we evaluate 18 contact predictors on 610 non-redundant proteins and 32 CASP13 targets from a wide range of perspectives. The results show that different methods suit different application scenarios: (1) DL methods based on multiple categories of inputs and large training sets are the best choices for low-contact-density proteins, such as intrinsically disordered proteins and proteins with shallow multiple sequence alignments (MSAs). (2) With at least 5L (L is the sequence length) effective sequences in the MSA, all methods show their best performance, and methods that rely only on the MSA as input can match methods that adopt multi-source inputs. (3) For top L/5 and L/2 predictions, DL methods predict more hydrophobic interactions, while ECA methods predict more salt bridges and disulfide bonds. (4) ECA methods detect more secondary structure interactions, while DL methods more accurately excavate contact patterns and prune isolated false positives. In general, multi-input DL methods with large training sets dominate current approaches with the best overall performance. Despite the great success of current DL methods, it must be stated that there is still much room for improvement: (1) With shallow MSAs, performance is greatly affected. (2) Current methods show lower precision for inter-domain than for intra-domain contact predictions, as well as very high imbalances in precision among intra-domain predictions. (3) Strong prediction similarities between DL methods indicate that more feature types and more diversified models need to be developed. (4) The runtime of most methods can be further optimized.
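The top L/5 (or L/2) precision criterion used in contact prediction evaluations can be computed with a short sketch; the data and names here are illustrative:

```python
def top_precision(preds, true_contacts, L, frac=5):
    """Rank predicted contacts by confidence, keep the top L/frac, and
    return the fraction of those that are true contacts."""
    top = sorted(preds, key=lambda p: -p[2])[:max(L // frac, 1)]
    hits = sum((i, j) in true_contacts for i, j, _ in top)
    return hits / len(top)

# Toy example: predictions as (residue_i, residue_j, confidence) triples
# for a hypothetical protein of length L = 10.
L = 10
preds = [(1, 8, 0.9), (2, 9, 0.8), (3, 7, 0.4), (0, 5, 0.2)]
true_contacts = {(1, 8), (3, 7)}
print(top_precision(preds, true_contacts, L))   # top L/5 = 2 preds, 1 correct
```

Scaling the cutoff by sequence length makes the metric comparable across proteins, since the number of true contacts also grows roughly with L.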

20.
MOTIVATION: To predict which of the vast number of human single nucleotide polymorphisms (SNPs) are deleterious to gene function or likely to be disease associated is an important problem, and many methods have been reported in the literature. All methods require data sets of mutations classified as 'deleterious' or 'neutral' for training and/or validation. While different workers have used different data sets there has been no study of which is best. Here, the three most commonly used data sets are analysed. We examine their contents and relate this to classifiers, with the aims of revealing the strengths and pitfalls of each data set, and recommending a best approach for future studies. RESULTS: The data sets examined are shown to be substantially different in content, particularly with regard to amino acid substitutions, reflecting the different ways in which they are derived. This leads to differences in classifiers and reveals some serious pitfalls of some data sets, making them less than ideal for non-synonymous SNP prediction. AVAILABILITY: Software is available on request from the authors.  相似文献   


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号