Similar articles
 Found 20 similar articles; search took 15 ms
1.
2.
The identification of protein function based on biological information is an area of intense research. Here we consider a complementary technique that quantitatively groups and relates proteins based on the chemical similarity of their ligands. We began with 65,000 ligands annotated into sets for hundreds of drug targets. The similarity score between each set was calculated using ligand topology. A statistical model was developed to rank the significance of the resulting similarity scores, which are expressed as a minimum spanning tree to map the sets together. Although these maps are connected solely by chemical similarity, biologically sensible clusters nevertheless emerged. Links among unexpected targets also emerged, among them that methadone, emetine and loperamide (Imodium) may antagonize muscarinic M3, alpha2 adrenergic and neurokinin NK2 receptors, respectively. These predictions were subsequently confirmed experimentally. Relating receptors by ligand chemistry organizes biology to reveal unexpected relationships that may be assayed using the ligands themselves.
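
The abstract above describes scoring the similarity between ligand sets and linking the sets with a minimum spanning tree. The sketch below illustrates that pipeline under loud assumptions: fingerprints are toy sets of feature ids, the set-versus-set score is a plain mean of pairwise Tanimoto coefficients, and the abstract's statistical significance model is not reproduced. The function names and the example data are hypothetical.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of feature ids."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def set_similarity(ligands_a, ligands_b):
    """Mean pairwise Tanimoto between two ligand sets (an assumed, simplified score)."""
    pairs = [(x, y) for x in ligands_a for y in ligands_b]
    return sum(tanimoto(x, y) for x, y in pairs) / len(pairs)


def minimum_spanning_tree(names, distance):
    """Kruskal's algorithm; `distance` maps frozenset({u, v}) to a float."""
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the component root, shortening the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    edges = sorted(combinations(names, 2), key=lambda e: distance[frozenset(e)])
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two separate components
            parent[root_u] = root_v
            tree.append((u, v))
    return tree


# Toy, made-up fingerprints for three hypothetical target sets.
targets = {
    "A": [{1, 2, 3}, {1, 2, 4}],
    "B": [{1, 2, 3, 5}],
    "C": [{7, 8, 9}],
}
distance = {
    frozenset((u, v)): 1.0 - set_similarity(targets[u], targets[v])
    for u, v in combinations(targets, 2)
}
tree = minimum_spanning_tree(sorted(targets), distance)
```

Converting similarity to distance as `1 - similarity` is one simple choice; a ranking based on statistical significance, as the abstract describes, would replace that step.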

3.
The role of recombination and mutation in 16S-23S rDNA spacer rearrangements   Total citations: 25, self-citations: 0, citations by others: 25
Gürtler V 《Gene》1999,238(1):241-252
The intragenomic heterogeneity of the bacterial intergenic (16S-23S rDNA) spacer region (ISR) was analysed from the following species in which sequences for the complete rRNA operon (rrn) set have been determined (rrn number): Enterococcus faecalis (6) and E. faecium (6), Bacillus subtilis (10), Staphylococcus aureus (9), Vibrio cholerae (4), Haemophilus influenzae (6) and Escherichia coli (7). It was found that some spacer sequence blocks were highly conserved between operons of a genome, whereas the presence of others was variable. When these variations were analysed using the program PLATO and partial likelihood phylogenies determined by DNAml for each operon set, three regions showed significant (Z>3.3) spatial variation [Region I was 78-184 nt long (2.14.4) possibly due to recombination or selection. Within Region I, there was sequence block variation in all operon sets [some operons contained tRNA genes (tRNAala, tRNAile or tRNAglu), whereas others had sequence blocks such as VS2 (S. aureus) or rsl (E. coli)]. Analysis of the ISR sequence from E. faecalis and E. faecium showed that there was more interspecies than intraspecies variation (both in DNA sequence and in the presence or absence of blocks). Dot matrix analysis of the sequence blocks in the nine rrn ISRs from S. aureus showed that there was significant homology between VS2 and VS5/VS6. Furthermore, repeat motifs with only A or T were present in higher copy numbers in VS5/VS6 than in VS2. Since these sequence blocks (VS2 and VS5-VS6) are related, intragenic evolution resulting in AT expansion may have occurred between these two regions. A model is proposed that postulates a role for recombination and AT-expansion in intra-genomic ISR variations. This process may represent a general mechanism of concerted evolution for bacterial ISR rearrangements.

4.
Crystal structure data of globular proteins were used to prepare (phi, psi) probability maps of the 20 proteinogenic amino acids. These maps were compared grid-wise with each other and a conformational similarity index was calculated for each pair of amino acids. A weight matrix, called the Conformational Similarity Weight (CSW) matrix, was prepared using the conformational similarity index. This weight matrix was used to align sequences of 21 pairs of proteins whose crystal structures are known. The aligned regions with more than seven contiguous amino acids were further analysed by plotting average weight (W) values of overlapping heptapeptides in these regions and carrying out curve fitting by a Fourier series having ten harmonics. The protein fragments corresponding to the half-linewidth of peaks were predicted as fragments having similar conformation in the protein pair under consideration. Such an approach allows us to pick up conformationally similar protein fragments with more than 67% accuracy.

5.
Both the ordered and disordered solvent networks of vitamin B12 coenzyme crystal hydrate have been generated by Monte Carlo simulation techniques. Several different potential functions have been used to model both water-water and water-solute (i.e., water-coenzyme) interactions. The results have been analysed in terms of the structural properties of the water networks, such as mean water oxygen and hydrogen positions, coordination of each water molecule, and maxima of probability density maps in all four asymmetric units of this crystal. The following results were found: (I) Within each asymmetric unit only one hydrogen bonding network was predicted although there were several hydrogen atom positions for any one solvent molecule (defined as maxima in probability density). (II) Reasonable agreement was obtained between predicted and experimental positions in the ordered solvent region, independent of the potential function used. (III) The positions of the calculated probability density maxima for the disordered channel region were different in different asymmetric units; this led to different simulated hydrogen bond networks which were not always consistent with the experimentally determined alternative (lower occupancy) sites. The results suggest that it is advisable to simulate more than one asymmetric unit if one wishes to look at disorder in the solvent regions. Probability density maps were qualitatively very useful for picturing these disordered regions. However, there were no significant differences between quantitative results predicted using either average atomic positions or maxima of the probability density distributions. Problems in quantifying agreement between experimental and predicted disordered solvent networks are discussed. The potential which included hydrogen atoms explicitly (EMPWI) seemed to give the best overall agreement, mainly because it was successful in predicting the unusually short hydrogen bonds which are found in this crystal.

6.
Results of classification of terrestrial ecosystems using an average similarity matrix are reported for the West Siberian Plain. Initial indices are first calculated separately for four components of an ecosystem. These components (blocks) include the underground block (soil humus, mortmass, and underground phytomass), above-ground vegetation, invertebrates, and vertebrates. The boundaries found for the separate blocks were shown to mismatch one another and to differ from the inhomogeneity of the ecosystems as a whole. These differences are observed in both the typological and the typological-chorological analyses. The indicated features of spatial succession within the blocks generate continuity of ecosystems and account for the conventional character of all the classifications and drawn boundaries.

7.
Bissantz C  Schalon C  Guba W  Stahl M 《Proteins》2005,61(4):938-952
The aim of this study was to investigate the usefulness of structure-based virtual screening (VS) for focused library design in G protein-coupled receptor (GPCR) projects, using 5-HT(2c) agonists as an example. We compared the performance of structure-based VS against two different homology models, using FRED for docking and ScreenScore, FlexX, and PMF for rescoring, with the results of 12 ligand-based similarity searches using four different query compounds and three different similarity metrics (Daylight, FTree, Phacir). The results of the similarity searches varied widely, from an enrichment factor of up to 3.2 to worse than random, whereas the structure-based VS gave a more stable result with a constant enrichment factor of around 2. Additionally, actives retrieved by the structure-based approach were more diverse than the actives among the top scorers of the similarity searches. Based on these results, we suggest basing a focused library design for a GPCR project on a combination of a ligand-based similarity search and structure-based docking.
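
The enrichment factors quoted above compare the hit rate among the top-ranked compounds with the hit rate in the whole library. A minimal sketch of that calculation follows; the function name, the choice of a top fraction, and the toy ranking are assumptions for illustration and are not taken from the study.

```python
def enrichment_factor(ranked_is_active, top_fraction):
    """Hit rate in the top-ranked fraction divided by the hit rate overall.

    `ranked_is_active` is a list of booleans ordered from best to worst score.
    """
    n_total = len(ranked_is_active)
    actives_total = sum(ranked_is_active)
    if n_total == 0 or actives_total == 0:
        raise ValueError("need a non-empty ranking with at least one active")
    n_top = max(1, round(n_total * top_fraction))
    actives_top = sum(ranked_is_active[:n_top])
    return (actives_top / n_top) / (actives_total / n_total)


# Toy ranking of 20 compounds containing 4 actives, 2 of them in the top 5.
ranking = [True, False, True, False, False] + [False] * 13 + [True, True]
ef = enrichment_factor(ranking, top_fraction=0.25)
```

A value of 1 corresponds to random selection, so "worse than random" in the abstract means an enrichment factor below 1.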

8.
A two-stage neural network has been used to predict protein secondary structure based on the position specific scoring matrices generated by PSI-BLAST. Despite the simplicity and convenience of the approach used, the results are found to be superior to those produced by other methods, including the popular PHD method, according to our own benchmarking results and the results from the recent Critical Assessment of Techniques for Protein Structure Prediction experiment (CASP3), where the method was evaluated by stringent blind testing. Using a new testing set based on a set of 187 unique folds, and three-way cross-validation based on structural similarity criteria rather than the sequence similarity criteria used previously (no similar folds were present in both the testing and training sets), the method presented here (PSIPRED) achieved an average Q3 score of between 76.5% and 78.3%, depending on the precise definition of observed secondary structure used, which is the highest published score for any method to date. Given the success of the method in CASP3, it is reasonable to be confident that the evaluation presented here gives a fair indication of the performance of the method in general.
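
The Q3 score reported above is the percentage of residues whose three-state label (helix, strand, coil) is predicted correctly. The sketch below shows that calculation on made-up strings; the function name and the H/E/C encoding are assumptions, and it does not reproduce how observed states were derived in the study.

```python
def q3_score(predicted, observed):
    """Percentage of positions where the predicted three-state label matches."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Toy example: 6 of 8 residues agree.
score = q3_score("HHHECCCC", "HHHHCCCE")
```

Because the score depends on how observed structure is reduced to three states, the same predictions can yield slightly different Q3 values, which matches the range quoted in the abstract.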

9.
Facioscapulohumeral muscular dystrophy (FSHMD) is a neuromuscular disorder characterized by autosomal dominant inheritance and clinical onset in the muscles of the face and shoulder girdle. Using a set of RFLP markers spaced at approximately 20 centimorgans, we have begun a systematic search for markers linked to the disease. A total of 81 RFLP loci on six autosomes (1, 2, 5, 7, 10, and 16) have been examined for linkage to FSHMD in 13 families. With the computer program CRI-MAP, two-point and multipoint analyses have not resulted in any LOD score indicative of linkage to FSHMD. However, these analyses have allowed us to exclude 909 centimorgans (sex average) of our genetic maps in intervals where the LOD score is less than -2.0. We estimate our data have excluded 23% of the human genome.
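
The exclusion criterion above (LOD below -2.0) can be illustrated with the textbook two-point LOD score for phase-known meioses. This is a simplified sketch under stated assumptions: it is not how CRI-MAP computes likelihoods for real pedigrees, and the function name and counts are hypothetical.

```python
import math


def lod_score(n_meioses, n_recombinants, theta):
    """log10 of L(theta) / L(0.5) for phase-known, fully informative meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    non_recombinants = n_meioses - n_recombinants
    if theta == 0.0 and n_recombinants > 0:
        return float("-inf")  # any recombinant rules out complete linkage
    log_l_theta = non_recombinants * math.log10(1.0 - theta)
    if n_recombinants:
        log_l_theta += n_recombinants * math.log10(theta)
    log_l_null = n_meioses * math.log10(0.5)
    return log_l_theta - log_l_null


# Toy data: 5 recombinants in 10 meioses, tested at theta = 0.05.
lod = lod_score(10, 5, 0.05)
excluded = lod < -2.0  # conventional threshold for excluding linkage
```

Evaluating the score over a grid of theta values on either side of a marker gives the interval that can be excluded, which is how excluded map lengths are usually accumulated.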

10.
Underprivileged areas were identified by weighting several census variables that relate to social conditions, by using weights determined by means of a questionnaire sent to one in 10 of the general practitioners in the United Kingdom. The weighted variables were added (after statistical manipulation) to give a score for each of the 9265 electoral wards in England and Wales. Blank ward maps were sent to general practitioners in five family practitioner committee areas and they were asked to shade the wards according to the degree to which the population increased their workload or the pressure on their services. Maps of these same areas were then prepared by using the calculated scores with the cut off points between the worst, the intermediate, and the best areas as on those used by the general practitioners. The two sets of maps were then compared to determine how well the maps that were based on scores agreed with the general practitioners' maps showing their assessment of the variation of workload in their areas. Overall, 6.3% of the wards differed in shading in any way between the two sets of maps. In the three areas where the general practitioners shaded complete wards and did not report having difficulties with shading only 1.2% of the wards differed. It may be possible to use these "underprivileged area" scores to indicate where problems occur for general practitioners and to extend this work to other primary health care workers.

11.
Helgason CM  Jobe TH 《PloS one》2008,3(4):e1909
BACKGROUND: It has been shown that the clinical state of one patient can be represented by known measured variables of interest, each of which then forms an element of a fuzzy set represented as a point in the unit hypercube. We hypothesized that precise comparison of a single patient with the average patient of a large double blind controlled randomized study is possible using fuzzy theory. METHODS/PRINCIPAL FINDINGS: The sets-as-points geometry of the unit hypercube allows fuzzy subsethood to define, in measures of fuzzy cardinality, different conditions, similarity, and comparison between fuzzy sets. A fuzzy measure of prediction is defined from fuzzy measures of similarity and comparison. It is a measure of the degree to which fuzzy set A is similar to fuzzy set B when different conditions are taken into account and removed from the comparison. When represented as a fuzzy set, that is, as a point in the unit hypercube, a clinical patient can be compared to an average patient of a large group study in a precise manner. This comparison is expressed by the fuzzy prediction measure. This measure in itself is not a probability. Once the patient is thus precisely matched to the average patient of a large group study, risk reduction is calculated by multiplying the measured similarity of the clinical patient by the risk of the average trial patient. CONCLUSION/SIGNIFICANCE: Although otherwise not precisely translatable to the single case, the results of group statistics can be applied to the single case through the use of fuzzy subsethood and measured in fuzzy cardinality. This measure is an alternative to a Bayesian or other probability based statistical approach.
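
The abstract builds on fuzzy cardinality and fuzzy subsethood. The sketch below shows the standard definitions from fuzzy set theory (cardinality as the sum of memberships, subsethood as the cardinality of the intersection divided by the cardinality of the first set) together with one common ratio-based similarity. It is an assumption that the paper uses exactly these forms, and the paper's own prediction measure is not reproduced here. The membership values are made up.

```python
def cardinality(fuzzy_set):
    """Fuzzy cardinality: the sum of the membership values."""
    return sum(fuzzy_set)


def subsethood(a, b):
    """Degree to which fuzzy set `a` is a subset of fuzzy set `b`."""
    overlap = [min(x, y) for x, y in zip(a, b)]  # fuzzy intersection
    return cardinality(overlap) / cardinality(a)


def similarity(a, b):
    """Cardinality of the intersection divided by cardinality of the union."""
    overlap = [min(x, y) for x, y in zip(a, b)]
    union = [max(x, y) for x, y in zip(a, b)]
    return cardinality(overlap) / cardinality(union)


# Hypothetical membership values for three clinical variables.
patient = [0.2, 0.8, 0.5]
trial_average = [0.4, 0.6, 0.5]

s_patient_in_trial = subsethood(patient, trial_average)
sim = similarity(patient, trial_average)
```

Each list is a point in the unit hypercube, one coordinate per measured variable, which is the sets-as-points view the abstract refers to.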

12.
Small molecule drugs target many core metabolic enzymes in humans and pathogens, often mimicking endogenous ligands. The effects may be therapeutic or toxic, but are frequently unexpected. A large-scale mapping of the intersection between drugs and metabolism is needed to better guide drug discovery. To map the intersection between drugs and metabolism, we have grouped drugs and metabolites by their associated targets and enzymes using ligand-based set signatures created to quantify their degree of similarity in chemical space. The results reveal the chemical space that has been explored for metabolic targets, where successful drugs have been found, and what novel territory remains. To aid other researchers in their drug discovery efforts, we have created an online resource of interactive maps linking drugs to metabolism. These maps predict the “effect space” comprising likely target enzymes for each of the 246 MDDR drug classes in humans. The online resource also provides species-specific interactive drug-metabolism maps for each of the 385 model organisms and pathogens in the BioCyc database collection. Chemical similarity links between drugs and metabolites predict potential toxicity, suggest routes of metabolism, and reveal drug polypharmacology. The metabolic maps enable interactive navigation of the vast biological data on potential metabolic drug targets and the drug chemistry currently available to prosecute those targets. Thus, this work provides a large-scale approach to ligand-based prediction of drug action in small molecule metabolism.

13.
The aim of our work was to study the opposite polarity of the PQ segment to the P wave body surface potential maps in different groups of patients. We constructed isointegral maps (IIM) in 26 healthy controls (C), 16 hypertensives (HT), 26 patients with arterial hypertension and left ventricular hypertrophy (LVH) and 15 patients with myocardial infarction (MI). We analyzed values and positions of map extrema and compared the polarity of maps using the correlation coefficient. The IIM P maxima appeared mainly over the precordium, the minima mainly in the right subclavicular area. The highest maxima were in the MI group, being significantly higher than in the HT and LVH groups. No differences concerning any values of other extrema were significant. The IIM PQ maxima were distributed over the upper half of the chest; the minima mainly over the middle sternum. A statistically significant opposite polarity between the IIM P and IIM PQ was found in 80% of cases. The opposite polarity of the P wave and the PQ segment was proved in isointegral body surface maps. The extrema occurred in areas not examined by the standard chest leads. This has to be considered for diagnostic purposes.
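
Comparing the polarity of two maps with a correlation coefficient can be sketched as follows: each map is flattened to a list of values at the same electrode positions, and a strongly negative Pearson coefficient indicates opposite polarity. The function name and the four-electrode toy maps are assumptions, and the significance testing used in the study is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally sized value lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant map")
    return sxy / math.sqrt(sxx * syy)


# Made-up isointegral values at four electrode sites.
p_map = [1.0, 2.0, 3.0, 4.0]
pq_map = [-2.0, -4.0, -6.0, -8.0]
r = pearson(p_map, pq_map)
```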

14.
Single-particle analysis is a structure determination method using electron microscopic (EM) images, which does not require protein crystals. In this method, projections are picked up and used to reconstruct a three-dimensional (3D) structure. When the conical tilting method is not available, the particle images are usually classified and averaged to improve the signal-to-noise ratio. The Euler angles of these average images must be assigned a posteriori to create a primary 3D model. We developed a new, fully automatic unsupervised Euler angle assignment method, which does not require an initial 3D reference and which is applicable to asymmetric molecules. In this method, the Euler angle of each average image is initially set randomly and then automatically corrected in relation to those of the other averages by iterated optimizations using the Simulated Annealing (SA) algorithm. At each iteration, the 3D structure is reconstructed based on the current Euler angles and reprojected back in the average-input directions. A modified cross-correlation between each reprojection and its corresponding original average is then calculated. The correlations are summed as a total 3D echo-correlation score to evaluate the Euler angles at this iteration. Then, one of the projections is selected, its Euler angle is changed randomly, and the score is calculated again. Based on the score change, the decision whether to accept or reject the new angle is made using the SA algorithm, which is introduced to escape local minima. After a certain number of iterations of this process, the angles of all averages converge so as to create a reliable primary 3D model. This echo-correlated 3D reconstruction with simulated annealing also has potential for wide application to general 3D reconstruction from various types of 2D images.
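
The accept-or-reject loop described above can be sketched with a generic simulated annealing routine. Everything specific here is an assumption made for illustration: each image carries a single angle instead of three Euler angles, `toy_score` is a stand-in for the echo-correlation score (no reconstruction or reprojection is performed), and the geometric cooling schedule and parameter values are arbitrary.

```python
import math
import random


def accept(score_change, temperature, rng):
    """Metropolis rule for a score that is being maximised."""
    if score_change >= 0:
        return True
    return rng.random() < math.exp(score_change / temperature)


def anneal(angles, score, steps, t_start, t_end, rng):
    """Perturb one angle per step and keep or revert it via `accept`."""
    current = score(angles)
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        i = rng.randrange(len(angles))
        old = angles[i]
        angles[i] = rng.uniform(0.0, 360.0)
        new = score(angles)
        if accept(new - current, t, rng):
            current = new
        else:
            angles[i] = old  # reject: restore the previous angle
    return angles, current


def toy_score(angles, target=(30.0, 120.0, 250.0)):
    """Hypothetical score: higher when angles are closer to a hidden target."""
    total = 0.0
    for a, t in zip(angles, target):
        d = abs(a - t) % 360.0
        d = min(d, 360.0 - d)  # circular distance in degrees
        total -= d * d
    return total


rng = random.Random(0)
start = [rng.uniform(0.0, 360.0) for _ in range(3)]
final_angles, final_score = anneal(start, toy_score, 2000, 1000.0, 0.1, rng)
```

Accepting some score decreases at high temperature is what lets the search leave a poor local optimum; as the temperature falls the rule approaches plain hill climbing.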

15.
Genome-wide linkage analysis using microsatellite markers has been successful in the identification of numerous Mendelian and complex disease loci. The recent availability of high-density single-nucleotide polymorphism (SNP) maps provides a potentially more powerful option. Using the simulated and Collaborative Study on the Genetics of Alcoholism (COGA) datasets from the Genetic Analysis Workshop 14 (GAW14), we examined how altering the density of SNP marker sets impacted the overall information content, the power to detect trait loci, and the number of false positive results. For the simulated data we used SNP maps with density of 0.3 cM, 1 cM, 2 cM, and 3 cM. For the COGA data we combined the marker sets from Illumina and Affymetrix to create a map with average density of 0.25 cM and then, using a sub-sample of these markers, created maps with density of 0.3 cM, 0.6 cM, 1 cM, 2 cM, and 3 cM. For each marker set, multipoint linkage analysis using MERLIN was performed for both dominant and recessive traits derived from marker loci. Our results showed that information content increased with increased map density. For the homogeneous, completely penetrant traits we created, there was only a modest difference in ability to detect trait loci. Additionally, as map density increased there was only a slight increase in the number of false positive results when there was linkage disequilibrium (LD) between markers. The presence of LD between markers may have led to an increased number of false positive regions but no clear relationship between regions of high LD and locations of false positive linkage signals was observed.

16.
Herring AH  Dunson DB  Dole N 《Biometrics》2004,60(4):926-935
Researchers often measure stress using questionnaire data on the occurrence of potentially stress-inducing life events and the strength of reaction to these events, characterized as negative or positive and assigned an ordinal ranking. In studying the health effects of stress, one needs to obtain measures of an individual's negative and positive stress levels to be used as predictors. Motivated by data of this type, we propose a latent variable model, which is characterized by event-specific negative and positive reaction scores. If the positive reaction score dominates the negative reaction score for an event, then the individual's reported response to that event will be positive, with an ordinal ranking determined by the value of the score. Measures of overall positive and negative stress can be obtained by summing the reactivity scores across the events that occur for an individual. By incorporating these measures as predictors in a regression model and fitting the stress and outcome models jointly using Bayesian methods, inferences can be conducted without the need to assume known weights for the different events. We propose an MCMC algorithm for posterior computation and apply the approach to study the effects of stress on preterm delivery.

17.
A new similarity score (sigma-score) is proposed which is able to find the correct protein structure among the very close alternatives and to distinguish between correct and deliberately misfolded structures. This score is based on the general principle 'similar likes similar', and it favors hydrophobic and hydrophilic contacts, and disfavors hydrophobic-to-hydrophilic contacts in proteins. The values of sigma-scores calculated for the high-resolution protein structures from the representative set are compared with those of alternatives: (i) very close alternatives which are only slightly distorted by conformational energy minimization in vacuo; (ii) alternatives with subsequently growing distortions, generated by molecular dynamics simulations in vacuo; (iii) structures derived by molecular dynamics simulation in solvent at 300 K; (iv) deliberately misfolded protein models. In nearly all tested cases the similarity score can successfully distinguish between experimental structure and its alternatives, even if the root mean square displacement of all heavy atoms is less than 1 Å. The confidence interval of the similarity score was estimated using the high-resolution X-ray structures of domain pairs related by non-crystallographic symmetry. The similarity score can be used for the evaluation of the general quality of the protein models, choosing the correct structures among the very close alternatives, characterization of models simulating folding/unfolding, etc.

18.
Annotations of genes and their products are largely guided by inferred homology. Sequence similarity is the primary measure used for annotation; domain content and order have been given less importance, even though domain insertions, deletions, and positional changes can introduce functional variety. Several recently developed methods quantify domain architecture similarity, but they depend on sequence alignments and focus only on homologous proteins. We present an alignment-free domain architecture-similarity search (ADASS) algorithm that identifies proteins that share very poor sequence similarity yet have similar domain architectures. We introduce a “singlet matching-triplet comparison” method in ADASS, wherein each triplet of domains is compared with other triplets in a pair-wise comparison of two domain architectures. Different events in the triplet comparison are scored according to a scoring scheme, and an average pairwise distance score (Domain Architecture Distance score - DAD Score) is calculated between protein domain architectures. We use the domain architectures of a selected domain, termed the centric domain, and cluster them based on DAD score. The algorithm has a high Positive Prediction Value (PPV) with respect to the clustering of the sequences of selected domain architectures. A comparison of domain architecture based dendrograms using the ADASS method and an existing method revealed that ADASS can classify proteins depending on the extent of domain architecture level similarity. ADASS is more relevant in cases of proteins with tiny domains that contribute little to the overall sequence similarity but significantly to the overall function.

19.
Switchgrass (Panicum virgatum L.) is a native perennial warm season (C4) grass that has been identified as a promising species for bioenergy research and production. Consequently, biomass yield and feedstock quality improvements are high priorities for switchgrass research. The objective of this study was to develop a switchgrass genetic linkage map using a full-sib pseudo-testcross mapping population derived from a cross between two heterozygous genotypes selected from the lowland cultivar ‘Alamo’ (AP13) and the upland cultivar ‘Summer’ (VS16). The female parent (AP13) map consists of 515 loci in 18 linkage groups (LGs) and spans 1,733 cM. The male parent (VS16) map arranges 363 loci in 17 LGs and spans 1,508 cM. No obvious cause for the lack of one LG in VS16 could be identified. Comparative analyses between the AP13 and VS16 maps showed that the two major ecotypic classes of switchgrass have highly colinear maps with similar recombination rates, suggesting that chromosomal exchange between the two ecotypes should be able to occur freely. The AP13 and VS16 maps are also highly similar with respect to marker orders and recombination levels to previously published switchgrass maps. The genetic maps will be used to identify quantitative trait loci associated with biomass and quality traits. The AP13 genotype was used for the whole genome-sequencing project and the map will thus also provide a tool for the anchoring of the switchgrass genome assembly.

20.
A formaldehyde denaturation map of the replicative form of phiX174 DNA was obtained. The RFI DNA was converted into a linear state by the restriction endonuclease PstI, which introduces a single double-stranded break into this DNA. The map has four clear-cut peaks. Their positions correlate very well with the peak positions on the map of equilibrium denaturation obtained theoretically earlier from the known nucleotide sequence of phiX174 DNA. The sequence is also used to calculate maps of smoothed AT-content. The maxima on these maps correlate well with the peaks on the denaturation maps. To reveal the causes of the good correlation between the experimental formaldehyde and theoretical equilibrium denaturation maps, theoretical formaldehyde denaturation maps are calculated for different conditions (temperature, formaldehyde concentration) using the detailed theory of DNA interaction with formaldehyde developed earlier.
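
A map of smoothed AT-content, as mentioned above, can be computed with a sliding window over the sequence. The sketch below is one simple way to do this; the window size, the treatment of the molecule as linear, and the toy sequence are assumptions and are not taken from the study.

```python
def smoothed_at_content(sequence, window):
    """Fraction of A/T bases in each window of the given size, moving one base at a time."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    is_at = [base in "AT" for base in seq]
    count = sum(is_at[:window])
    profile = [count / window]
    for i in range(window, len(seq)):
        # Slide the window: add the entering base, drop the leaving base.
        count += is_at[i] - is_at[i - window]
        profile.append(count / window)
    return profile


# Toy sequence; a real analysis would use the full phiX174 sequence.
profile = smoothed_at_content("AATTGGCCAT", window=4)
```

AT-rich stretches melt more easily than GC-rich ones, which is why maxima of such a profile are expected to line up with peaks on a denaturation map.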
