Found 20 similar articles; search took 15 ms
1.
P Broquet A Martin M J Peschard H Baubichon-Cortay M Serres-Guillaumond P Louisot 《The International journal of biochemistry》1987,19(7):653-656
Circadian variations of the acetylcholine muscarinic receptor and some glycosyltransferases were studied in brain using multivariate analysis. Highly significant correlations exist between fucosyltransferase, sialyltransferase and galactosyltransferase and, to a lesser extent, between these enzymes and the acetylcholine receptor. No correlation appeared between these enzymes and dolichol phosphate mannose synthase.
2.
Fold recognition was applied to the systematic analysis of all the sequences encoded by the genome of Mycobacterium tuberculosis H37Rv in order to identify new putative glycosyltransferases. The search was conducted against a library composed of all known crystal structures of glycosyltransferases and some related proteins. A clear relationship appeared between some sequences and some folds. It appears necessary to complete the fold recognition approach with a statistical approach in order to identify the relevant data above the background noise. Exploratory data analysis was carried out using several methods. Analytical methods confirmed the validity of the approach, while predictive methods, although very preliminary in the present case, allowed for identifying a number of sequences of interest that should be further investigated. This new approach of combining bioinformatics and chemometrics appears to be a powerful tool for analysis of newly sequenced genomes. Its application to glycobiology is of great interest.
3.
Marlow AJ Fisher SE Francks C MacPhie IL Cherny SS Richardson AJ Talcott JB Stein JF Monaco AP Cardon LR 《American journal of human genetics》2003,72(3):561-570
Replication of linkage results for complex traits has been exceedingly difficult, owing in part to the inability to measure the precise underlying phenotype, small sample sizes, genetic heterogeneity, and statistical methods employed in analysis. Often, in any particular study, multiple correlated traits have been collected, yet these have been analyzed independently or, at most, in bivariate analyses. Theoretical arguments suggest that full multivariate analysis of all available traits should offer more power to detect linkage; however, this has not yet been evaluated on a genomewide scale. Here, we conduct multivariate genomewide analyses of quantitative-trait loci that influence reading- and language-related measures in families affected with developmental dyslexia. The results of these analyses are substantially clearer than those of previous univariate analyses of the same data set, helping to resolve a number of key issues. These outcomes highlight the relevance of multivariate analysis for complex disorders for dissection of linkage results in correlated traits. The approach employed here may aid positional cloning of susceptibility genes in a wide spectrum of complex traits.
4.
OBJECTIVES: Multivariate tests for linkage can provide improved power over univariate tests but the type I error rates and comparative power of commonly used methods have not previously been compared. Here we studied the behavior of bivariate formulations of the variance component (VC) and Haseman-Elston (H-E) approaches. METHODS: We compared through simulation studies the bivariate H-E test with the unconstrained bivariate VC approach and with a VC approach in which the major-gene correlation is constrained to +/-1. We also compared these methods to univariate methods. RESULTS: Bivariate approaches are more powerful than univariate analyses unless the traits are very highly positively correlated. The power of the bivariate H-E test was lower than that of the VC procedures. The constrained test was often less powerful than the unconstrained test. The empirical distributions of the bivariate H-E test and the unconstrained bivariate VC test conformed with asymptotic distributions for samples of 100 or more sibships of size 4. CONCLUSIONS: The unconstrained VC test is valuable for testing for preliminary linkages using multivariate phenotypes. The bivariate H-E test was less powerful than the bivariate VC tests.
5.
Bioinformatics is a very powerful tool in the field of glycoproteomics as well as genomics and proteomics. As a part of the Glycogene Project (GG project), we have developed a novel bioinformatics system for the comprehensive identification and in silico cloning of human glycogenes. Using our system, a total of 105 candidate human glycogenes were identified and then engineered for heterologous expression. Of these candidates, 38 recombinant proteins were successfully identified for their enzyme activity and substrate specificity. We also classified 47 out of 60 carbohydrate-active enzyme glycosyltransferase families into 4 superfamilies using the profile Hidden Markov Model method. On the basis of our classification and the relationship between glycosylation pathways and superfamilies, we propose the evolution of glycosyltransferases.
6.
7.
8.
Mehmood T Bohlin J Bråthen Kristoffersen A Sæbø S Warringer J Snipen L 《BMC bioinformatics》2012,13(1):97
ABSTRACT: BACKGROUND: Gene finding is a complicated procedure that encapsulates algorithms for coding sequence modeling, identification of promoter regions, issues concerning overlapping genes and more. In the present study we focus on coding sequence modeling algorithms; that is, algorithms for identification and prediction of the actual coding sequences from genomic DNA. In this respect, we promote a novel multivariate method known as Canonical Powered Partial Least Squares (CPPLS) as an alternative to the commonly used Interpolated Markov model (IMM). Comparisons between the methods were performed on DNA, codon and protein sequences with highly conserved genes taken from several species with different genomic properties. RESULTS: The multivariate CPPLS approach classified coding sequence substantially better than the commonly used IMM on the same set of sequences. We also found that the use of CPPLS with codon representation gave significantly better classification results than both IMM with protein (p < 0.001) and with DNA (p < 0.001). Further, although the mean performance was similar, the variation of CPPLS performance on codon representation was significantly smaller than for IMM (p < 0.001). CONCLUSIONS: The performance of coding sequence modeling can be substantially improved by using an algorithm based on the multivariate CPPLS method applied to codon or DNA frequencies.
9.
Background
An organism's ability to adapt to its particular environmental niche is of fundamental importance to its survival and proliferation. In the largest study of its kind, we sought to identify and exploit the amino-acid signatures that make species-specific protein adaptation possible across 100 complete genomes.
Results
Environmental niche was determined to be a significant factor in variability from correspondence analysis using the amino acid composition of over 360,000 predicted open reading frames (ORFs) from 17 archaea, 76 bacteria and 7 eukaryote complete genomes. Additionally, we found clusters of phylogenetically unrelated archaea and bacteria that share similar environments by amino acid composition clustering. Composition analyses of conservative, domain-based homology modeling suggested an enrichment of small hydrophobic residues Ala, Gly, Val and charged residues Asp, Glu, His and Arg across all genomes. However, larger aromatic residues Phe, Trp and Tyr are reduced in folds, and these results were not affected by low complexity biases. We derived two simple log-odds scoring functions from ORFs (CG) and folds (CF) for each of the complete genomes. CF achieved an average cross-validation success rate of 85 ± 8% whereas the CG detected 73 ± 9% species-specific sequences when competing against all other non-redundant CG. Continuously updated results are available at http://genome.mshri.on.ca.
Conclusion
Our analysis of amino acid compositions from the complete genomes provides stronger evidence for species-specific and environmental residue preferences in genomic sequences as well as in folds. Scoring functions derived from this work will be useful in future protein engineering experiments and possibly in identifying horizontal transfer events.
10.
There are no well-accepted criteria for the diagnosis of the metabolic syndrome. However, the metabolic syndrome is identified clinically by the presence of three or more of these five variables: larger waist circumference, higher triglyceride levels, lower HDL-cholesterol concentrations, hypertension, and impaired fasting glucose. We use sets of two or three variables, which are available in the Framingham Heart Study data set, to localize genes responsible for this syndrome using multivariate quantitative linkage analysis. This analysis demonstrates the applicability of multivariate linkage analysis and how its use increases the power to detect linkage when genes are involved in the same disease mechanism.
11.
Takamatsu S Inoue N Katsumata T Nakamura K Fujibayashi Y Takeuchi M 《Biochemistry》2005,44(16):6343-6349
Many recombinant proteins developed or under development for clinical use are glycoproteins, and trials aimed at improving their bioactivity or pharmacokinetics in vivo by altering specific glycan structures are ongoing. For pharmaceuticals of glycoproteins, it is important to characterize and, if possible, control the glycosylation profile. However, the mechanism responsible for the regulation of sugar chain structures found on naturally occurring glycoproteins is still unclear. To clarify the relationship between glycosyltransferases and sugar chain branch structure, we estimated six glycosyltransferases' activities (N-acetylglucosaminyltransferase (GlcNAcTase)-I, -II, -III, -IV, -V, and beta-1,4-galactosyltransferase (GalT)) which control the branch formation on asparagine (Asn)-linked sugar chains in 18 human cancer cell lines derived from several tissues. To visualize the balance of glycosyltransferase activity associated with each cell line, we expressed the relative glycosyltransferase activity in comparison to the average activity among the cell lines. These cell lines were classified into five groups according to their relative glycosyltransferase balance and were termed GlcNAcTase-I/-II, GlcNAcTase-III, GlcNAcTase-IV, GlcNAcTase-V, and GalT. We also characterized the structures of Asn-linked sugar chains on the cell surface of representative cell lines of each group. The branching structure of cell surface sugar chains roughly corresponded to the glycosyltransferase balance. This finding suggests that, for the sugar chain structure remodeling of glycoproteins, attention should be focused on the glycosyltransferase balance of host cells before introducing exogenous glycosyltransferases or down-regulating the activity of intrinsic glycosyltransferases.
12.
M van Heel 《Journal of molecular biology》1991,220(4):877-887
A novel multivariate statistical approach is presented for extracting and exploiting intrinsic information present in our ever-growing sequence data banks. The information extraction from the sequences avoids the pitfalls of intersequence alignment by analyzing secondary invariant functions derived from the sequences in the data bank rather than the sequences themselves. A typical invariant function of this kind is a 20 x 20 histogram of occurrences of amino acid pairs in a given sequence or fragment thereof. To illustrate the potential of the approach, an analysis of 10,000 protein sequences from the National Biomedical Research Foundation Protein Identification Resource is presented, which already reveals great biological detail. For example, zeta-hemoglobin is found to lie close to amphibian and fish chi-hemoglobin which, in turn, is an important clue to the physiological function of this mammalian early embryonic hemoglobin. The multivariate statistical framework presented unifies such apparently unrelated issues as phylogenetic comparisons between a set of sequences and distance matrices between the constituents of the biological sequences. The Multivariate Statistical Sequence Analysis (MSSA) principles can be used for a wide spectrum of sequence analysis problems such as: assignment of family memberships to new sequences, validation of new incoming sequences to be entered into the database, prediction of structure from sequence, discrimination of coding from non-coding DNA regions, and automatic generation of an atlas of protein or DNA sequences. The MSSA techniques represent a self-contained approach to learning continuously and automatically from the growing stream of new sequences. The MSSA approach is particularly likely to play a significant role in major sequencing efforts such as the human genome project.
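The 20 x 20 pair histogram mentioned in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how pairs are formed, so this sketch assumes adjacent residue pairs; the function name `pair_histogram` and the alphabet ordering are hypothetical choices for illustration, not details of the MSSA method.

```python
# Assumption: the 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def pair_histogram(sequence):
    """Return a 20 x 20 matrix counting adjacent amino acid pairs.

    Assumption: "pairs" means consecutive residues; the abstract
    does not specify this. Pairs containing a non-standard symbol
    are skipped.
    """
    counts = [[0] * 20 for _ in range(20)]
    for first, second in zip(sequence, sequence[1:]):
        if first in INDEX and second in INDEX:
            counts[INDEX[first]][INDEX[second]] += 1
    return counts


if __name__ == "__main__":
    hist = pair_histogram("ACAC")
    # Row index = first residue, column index = second residue.
    print(hist[INDEX["A"]][INDEX["C"]], hist[INDEX["C"]][INDEX["A"]])
```

Because the histogram has a fixed size regardless of sequence length, sequences can be compared as points in a 400-dimensional space without first aligning them, which is the property the abstract highlights.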
13.
We have compared the efficiency of the lod score test which assumes heterogeneity (lod2) to the standard lod score test which assumes homogeneity (lod1) when three-point linkage analysis is used in successive map intervals. If it is assumed that a gene located midway between two linked marker loci is responsible for a proportion of disease cases, then the lod1 test loses power relative to the lod2 test, as the proportion of linked families decreases, as the flanking markers are more closely linked, and as more map intervals are tested. Moreover, when multipoint analysis is used, linkage for a disease gene is more likely to be incorrectly excluded from a complete and dense linkage map if true genetic heterogeneity is ignored. We thus conclude that, in general, the lod2 linkage test is more efficient for detecting a true linkage when a complete genetic marker map is screened for a heterogeneous disorder.
14.
15.
The inhibition of glycosyltransferases was studied using uridine monophosphate derivatives and uridine diphosphate sugar analogs. Modification in the nucleoside portion caused selective inhibition of glycosyltransferases.
16.
Fifty-two 3D structures of Ig-like domains covering the immunoglobulin fold family (IgFF) were compared and classified according to the conservation of their secondary structures. Members of the IgFF are distantly related proteins or evolutionarily unrelated proteins with a similar fold, the Ig fold. In this paper, a multiple structural alignment of the conserved common core is described and the correlation between corresponding sequences is discussed. While the members of the IgFF exhibit wide heterogeneity in terms of tissue and species distribution or functional implications, the 3D structures of these domains are far more conserved than their sequences. We define topologically equivalent residues in the Ig-like domains, describe the hydrophobic common cores and discuss the presence of additional strands. The disulfide bridges, not necessary for the stability of the Ig fold, may have an effect on the compactness of the domains. Based upon sequence and structure analysis, we propose the introduction of two new subtypes (C3 and C4) to the previous classifications, in addition to a new global structural classification. The very low mean sequence identity between subgroups of the IgFF suggests the occurrence of both divergent and convergent evolutionary processes, explaining the wide diversity of the superfamily. Finally, this review suggests that hydrophobic residues constituting the common hydrophobic cores are important clues to explain how highly divergent sequences can adopt a similar fold.
17.
18.
Background
Proteins that evolve from a common ancestor can change functionality over time, and it is important to be able to identify residues that cause this change. In this paper we show how a supervised multivariate statistical method, Between Group Analysis (BGA), can be used to identify these residues from families of proteins with different substrate specificities using multiple sequence alignments.
19.
Shen X 《Journal of theoretical biology》2011,268(1):77-83
The glycosaminoglycan (GAG) side-chains of small leucine-rich proteoglycans have been postulated to mechanically cross-link adjacent collagen fibrils and contribute to tendon mechanics. Enzymatic depletion of tendon GAGs (chondroitin and dermatan sulfate) has emerged as a preferred method to experimentally assess this role. However, GAG removal is typically incomplete and the possibility remains that extant GAGs may remain mechanically functional. The current study specifically investigated the potential mechanical effect of the remaining GAGs after partial enzymatic digestion.
A three-dimensional finite element model of tendon was created based upon the concept of proteoglycan mediated inter-fibril load sharing. Approximately 250 interacting, discontinuous collagen fibrils were modeled as having a length of 400 μm, being composed of rod elements of length 67 nm and E-modulus 1 GPa connected in series. Spatial distribution and diameters of these idealized fibrils were derived from a representative cross-sectional electron micrograph of tendon. Rod element lengths corresponded to the collagen fibril D-Period, widely accepted to act as a binding site for decorin and biglycan, the most abundant proteoglycans in tendon. Each element node was connected to nodes of any neighboring fibrils within a radius of 100 nm, the slack length of unstretched chondroitin sulfate. These GAG cross-links were the sole mechanism for lateral load sharing among the discontinuous fibrils, and were modeled as bilinear spring elements. Simulation of tensile testing of tendon with complete cross-linking closely reproduced corresponding experiments on rat tail tendons. Random reduction of 80% of GAG cross-links (matched to a conservative estimate of enzymatic depletion efficacy) predicted a drop of 14% in tendon modulus.
Corresponding mechanical properties derived from experiments on rat tail tendons treated in buffer with and without chondroitinase ABC were apparently unaffected, regardless of GAG depletion. Further tests for equivalence, conservatively based on effect size limits predicted by the model, confirmed equivalent stiffness between enzymatically depleted tendons and their native controls.
Although the model predicts that relatively small quantities of GAGs acting as primary collagen cross-linking elements could provide mechanical integrity to the tendon, partial enzymatic depletion of GAGs should result in mechanical changes that are not reflected in analogous experimental testing. We thus conclude that GAG side chains of small leucine-rich proteoglycans are not a primary determinant of tensile mechanical behavior in mature rat tail tendons.
20.
The cN-II class of 5' purine nucleotidases exhibits specificity for IMP/GMP and belongs to the HAD (haloacid dehalogenase) superfamily of hydrolases. The recently identified ISN1 class of IMP-specific 5'-nucleotidases, occurring in yeast, fungi and certain Plasmodia, lacks sequence homology with the cN-II class of enzymes. We show from analysis of motif and fold conservation that ISN1s also belong to the HAD superfamily. This identification adds a new member to this superfamily.