20 similar articles found.
1.
Nonparametric methods for incorporating genomic information into genetic evaluations: an application to mortality in broilers
Four approaches using single-nucleotide polymorphism (SNP) information (F(infinity)-metric model, kernel regression, reproducing kernel Hilbert spaces (RKHS) regression, and a Bayesian regression) were compared with a standard procedure of genetic evaluation (E-BLUP) of sires using mortality rates in broilers as a response variable, working in a Bayesian framework. Late mortality (14-42 days of age) records on 12,167 progeny of 200 sires were precorrected for fixed and random (nongenetic) effects used in the model for genetic evaluation and for the mate effect. The average of the corrected records was computed for each sire. Twenty-four SNPs seemingly associated with late mortality were included in three methods used for genomic-assisted evaluations. One thousand SNPs were included in the Bayesian regression, to account for markers along the whole genome. The posterior mean of heritability of mortality was 0.02 in the E-BLUP approach, suggesting that genetic evaluation could be improved if suitable molecular markers were available. Estimates of posterior means and standard deviations of the residual variance were 24.38 (3.88), 29.97 (3.22), 17.07 (3.02), and 20.74 (2.87) for E-BLUP, the linear model on SNPs, RKHS regression, and the Bayesian regression, respectively, suggesting that RKHS accounted for more variance in the data. The two nonparametric methods (kernel and RKHS regression) fitted the data better, having a lower residual sum of squares. Predictive ability, assessed by cross-validation, indicated advantages of the RKHS approach, where accuracy was increased from 25 to 150%, relative to other methods.
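As a rough illustration of the kernel regression idea mentioned in this abstract, the sketch below predicts a sire's mean response as a kernel-weighted average of the responses of other sires (a Nadaraya-Watson estimator). The Gaussian kernel, the 0/1/2 genotype coding, the bandwidth, and the toy data are all assumptions made for illustration; the abstract does not specify the kernel or the estimator the authors used.

```python
import math

def gaussian_kernel(x, y, bandwidth):
    """Gaussian kernel on the squared Euclidean distance between genotype vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def kernel_regression_predict(train_x, train_y, query, bandwidth=1.0):
    """Nadaraya-Watson estimate: kernel-weighted average of training responses."""
    weights = [gaussian_kernel(x, query, bandwidth) for x in train_x]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed; fall back to the unweighted mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Hypothetical toy data: 4 sires, 3 SNPs coded as 0/1/2 copies of an allele.
genotypes = [[0, 1, 2], [0, 1, 2], [2, 1, 0], [2, 0, 0]]
mean_mortality = [0.10, 0.12, 0.30, 0.28]
prediction = kernel_regression_predict(genotypes, mean_mortality, [0, 1, 2], bandwidth=0.5)
```

With a small bandwidth the prediction is dominated by sires whose genotypes match the query; a larger bandwidth pulls it toward the overall mean.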
2.
Abstract
The extended Kalman filter (EKF) has been applied to inferring gene regulatory networks. However, it is well known that the EKF becomes less accurate when the system exhibits high nonlinearity. In addition, certain prior information about the gene regulatory network exists in practice, and no systematic approach has been developed to incorporate such prior information into the Kalman-type filter for inferring the structure of the gene regulatory network. In this paper, an inference framework based on point-based Gaussian approximation filters that can exploit the prior information is developed to solve the gene regulatory network inference problem. Different point-based Gaussian approximation filters, including the unscented Kalman filter (UKF), the third-degree cubature Kalman filter (CKF3), and the fifth-degree cubature Kalman filter (CKF5) are employed. Several types of network prior information, including the existing network structure information, sparsity assumption, and the range constraint of parameters, are considered, and the corresponding filters incorporating the prior information are developed. Experiments on a synthetic network of eight genes and the yeast protein synthesis network of five genes are carried out to demonstrate the performance of the proposed framework. The results show that the proposed methods provide more accurate inference results than existing methods, such as the EKF and the traditional UKF.
3.
A topological framework for the computation of the HOMFLY polynomial and its application to proteins
Polymers can be modeled as open polygonal paths, and their closure generates knots. Detection of knotted proteins is currently achieved via high-throughput methods based on a common framework that is insensitive to the handedness of knots. Here we propose a topological framework for the computation of the HOMFLY polynomial, a handedness-sensitive invariant. Our approach couples a multi-component reduction scheme with the polynomial computation. After validation on tabulated knots and links, the framework was applied to the entire Protein Data Bank along with a set of selected topological checks that allowed us to discard artificially entangled structures. This led to an up-to-date table of knotted proteins that also includes two newly detected right-handed trefoil knots in recently deposited protein structures. The application range of our framework is not limited to proteins: it can be extended to the topological analysis of biological and synthetic polymers and, more generally, to arbitrary polygonal paths.
4.
Alongside the well-studied membrane-spanning helices, alpha-helical transmembrane (TM) proteins contain several functionally and structurally important types of substructures. Here, existing 3D structures of transmembrane proteins have been used to define and study the concept of reentrant regions, i.e. membrane-penetrating regions that enter and exit the membrane on the same side. We find that these regions can be divided into three distinct categories based on secondary structure motifs, namely long regions with a helix-coil-helix motif, regions of medium length with the structure helix-coil or coil-helix, and regions of short to medium length consisting entirely of irregular secondary structure. The residues situated in reentrant regions are significantly smaller on average compared to other regions, and reentrant regions can be detected in the inter-transmembrane loops with an accuracy of approximately 70% based on their amino acid composition. Using TOP-MOD, a novel method for predicting reentrant regions, we have scanned the genomes of Escherichia coli, Saccharomyces cerevisiae and Homo sapiens. The results suggest that more than 10% of transmembrane proteins contain reentrant regions and that the occurrence of reentrant regions increases linearly with the number of transmembrane regions. Reentrant regions seem to be most commonly found in channel proteins and least commonly in signal receptors.
5.
The paper concerns the practical realization of the maximum topologic similarity principle for phylogenetic reconstruction. This novel principle is described in the accompanying paper. Two algorithms, implemented in a computer program, find the unique tree when the source data admit the existence of such a tree. When numerous parallel mutations make such a precise realization impossible, the algorithms yield approximations to the maximum topologic similarity trees with high computational efficiency. Examples illustrating the use of these algorithms, as well as a discussion of the biological consistency of the novel concept, are presented.
6.
7.
8.
Christopher K Edlund, Won H Lee, Dalin Li, David J Van Den Berg, David V Conti. BMC bioinformatics, 2008, 9(1):174
Background
There has been considerable effort focused on developing efficient programs for tagging single-nucleotide polymorphisms (SNPs). Many of these programs do not account for potential reduced genomic coverage resulting from genotyping failures nor do they preferentially select SNPs based on functionality, which may be more likely to be biologically important.
9.
MOTIVATION: We review proposed syntheses of probabilistic sequence alignment, profiling and phylogeny. We develop a multiple alignment algorithm for Bayesian inference in the links model proposed by Thorne et al. (1991, J. Mol. Evol., 33, 114-124). The algorithm, described in detail in Section 3, samples from and/or maximizes the posterior distribution over multiple alignments for any number of DNA or protein sequences, conditioned on a phylogenetic tree. The individual sampling and maximization steps of the algorithm require no more computational resources than pairwise alignment. METHODS: We present a software implementation (Handel) of our algorithm and report test results on (i) simulated data sets and (ii) the structurally informed protein alignments of BAliBASE (Thompson et al., 1999, Nucleic Acids Res., 27, 2682-2690). RESULTS: We find that the mean sum-of-pairs score (a measure of residue-pair correspondence) for the BAliBASE alignments is only 13% lower for Handel than for CLUSTALW (Thompson et al., 1994, Nucleic Acids Res., 22, 4673-4680), despite the relative simplicity of the links model (CLUSTALW uses affine gap scores and increased penalties for indels in hydrophobic regions). With reference to these benchmarks, we discuss potential improvements to the links model and implications for Bayesian multiple alignment and phylogenetic profiling. AVAILABILITY: The source code to Handel is freely distributed on the Internet at http://www.biowiki.org/Handel under the terms of the GNU Public License (GPL, 2000, http://www.fsf.org./copyleft/gpl.html).
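The sum-of-pairs score mentioned in this abstract can be illustrated with a short sketch. It uses one common definition, assumed here rather than taken from the paper: the fraction of residue pairs aligned in a reference alignment that are also aligned in the test alignment. The function names, the gap symbol, and the toy alignments are hypothetical.

```python
from itertools import combinations

def aligned_pairs(alignment, gap="-"):
    """Return the set of residue pairs placed in the same column.

    Each pair is (seq_i, residue_index_i, seq_j, residue_index_j), with
    residue indices counted along the ungapped sequences.
    """
    counters = [0] * len(alignment)
    pairs = set()
    for column in zip(*alignment):
        present = []
        for seq_index, symbol in enumerate(column):
            if symbol != gap:
                present.append((seq_index, counters[seq_index]))
                counters[seq_index] += 1
        for (i, a), (j, b) in combinations(present, 2):
            pairs.add((i, a, j, b))
    return pairs

def sum_of_pairs_score(test_alignment, reference_alignment):
    """Fraction of reference residue pairs that the test alignment reproduces."""
    reference = aligned_pairs(reference_alignment)
    if not reference:
        return 0.0
    return len(aligned_pairs(test_alignment) & reference) / len(reference)

# Hypothetical toy alignments of the same three sequences (ACGT, ACAGT, AGT).
reference = ["AC-GT", "ACAGT", "A--GT"]
test = ["ACG-T", "ACAGT", "A-G-T"]
score = sum_of_pairs_score(test, reference)
```

In this toy case the test alignment recovers 8 of the 10 reference residue pairs, so the score is 0.8. A mean over many benchmark alignments would give a figure comparable in kind to the one the abstract reports.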
10.
Background
Due to their role as receptors or transporters, membrane proteins play a key role in many important biological functions. In our work we used Grammatical Inference (GI) to localize transmembrane segments. Our GI process is based specifically on the inference of Even Linear Languages.
11.
Heterotrimeric GTP-binding proteins (G proteins) that are made up of alpha and beta gamma subunits couple many kinds of cell-surface receptors to intracellular effector enzymes or ion channels. Every cell contains several types of receptors, G proteins, and effectors. The specificity with which G protein subunits interact with receptors and effectors defines the range of responses a cell is able to make to an external signal. Thus, the G proteins act as a critical control point that determines whether a signal spreads through several pathways or is focused to a single pathway. In this review, I will summarize some features of the structure and function of mammalian G protein subunits, discuss the role of both alpha and beta gamma subunits in regulation of effectors, the role of the beta gamma subunit in macromolecular assembly, and the mechanisms that might make some responses extremely specific and others rather diffuse.
12.
MemType-2L: a web server for predicting membrane proteins and their types by incorporating evolution information through Pse-PSSM
Given an uncharacterized protein sequence, how can we identify whether it is a membrane protein or not? If it is, which membrane protein type does it belong to? These questions are important because they are closely related to the biological function of the query protein and to its interaction process with other molecules in a biological system. In particular, with the avalanche of protein sequences generated in the Post-Genomic Age and the much slower progress in using biochemical experiments to determine their functions, it is highly desirable to develop an automated method that can help address these questions. In this study, a 2-layer predictor, called MemType-2L, has been developed: the 1st-layer prediction engine identifies a query protein as membrane or non-membrane; if it is a membrane protein, the process automatically continues with the 2nd-layer prediction engine to further identify its type among the following eight categories: (1) type I, (2) type II, (3) type III, (4) type IV, (5) multipass, (6) lipid-chain-anchored, (7) GPI-anchored, and (8) peripheral. MemType-2L incorporates evolution information by representing the protein samples with Pse-PSSM (Pseudo Position-Specific Score Matrix) vectors, and contains an ensemble classifier formed by fusing many powerful individual OET-KNN (Optimized Evidence-Theoretic K-Nearest Neighbor) classifiers. The success rates obtained by MemType-2L on a newly constructed stringent dataset by both the jackknife test and the independent dataset test are quite high, indicating that MemType-2L may become a very useful high-throughput tool. As a Web server, MemType-2L is freely accessible to the public at http://chou.med.harvard.edu/bioinf/MemType.
13.
Hurwitz N, Pellegrini-Calace M, Jones DT. Philosophical transactions of the Royal Society of London. Series B, Biological sciences, 2006, 361(1467):465-475
In this paper we briefly review some of the recent progress made by ourselves and others in developing methods for predicting the structures of transmembrane proteins from amino acid sequence. Transmembrane proteins are an important class of proteins involved in many diverse biological functions, many of which have great impact in terms of disease mechanism and drug discovery. Despite their biological importance, it has proven very difficult to solve the structures of these proteins by experimental techniques, and so there is a great deal of pressure to develop effective methods for predicting their structure. The methods we discuss range from methods for transmembrane topology prediction to new methods for low resolution folding simulations in a knowledge-based force field. This potential is designed to reproduce the properties of the lipid bilayer. Our eventual aim is to apply these methods in tandem so that useful three-dimensional models can be built for a large fraction of the transmembrane protein domains in whole proteomes.
14.
Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins.
A Bateman, E Birney, R Durbin, S R Eddy, R D Finn, E L Sonnhammer. Nucleic acids research, 1999, 27(1):260-262
Pfam is a collection of multiple alignments and profile hidden Markov models of protein domain families. Release 3.1 is a major update of the Pfam database and contains 1313 families which are available on the World Wide Web in Europe at http://www.sanger.ac.uk/Software/Pfam/ and http://www.cgr.ki.se/Pfam/, and in the US at http://pfam.wustl.edu/. Over 54% of proteins in SWISS-PROT-35 and SP-TrEMBL-5 match a Pfam family. The primary changes of Pfam since release 2.1 are that we now use the more advanced version 2 of the HMMER software, which is more sensitive and provides expectation values for matches, and that it now includes proteins from both SP-TrEMBL and SWISS-PROT.
15.
A set of pairwise contact potentials between amino acid residues in transmembrane helices was determined from the known native structure of the transmembrane protein (TMP) bacteriorhodopsin by the method of perceptron learning, using Monte Carlo dynamics to generate suitable "decoy" structures. The procedure of finding these decoys is simpler than for globular proteins, since it is reasonable to assume that helices behave as independent, stable objects and, therefore, the search in the conformational space is greatly reduced. With the learnt potentials, the association of the helices in bacteriorhodopsin was successfully simulated. The folding of a second TMP (the helix-dimer glycophorin A) was then accomplished with only a refinement of the potentials from a small number of decoys.
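The perceptron learning step described in this abstract can be sketched as follows, under assumptions that go beyond what the abstract states: the energy is taken to be linear in the contact potentials (potential times the number of contacts of each type), the decoys are supplied as precomputed contact counts rather than generated by Monte Carlo dynamics, and the potentials are updated until every decoy has a higher energy than the native structure. All names and numbers are hypothetical.

```python
def energy(potentials, contact_counts):
    """Energy as a linear function of pairwise contact potentials."""
    return sum(u * n for u, n in zip(potentials, contact_counts))

def learn_potentials(native_counts, decoy_counts, max_epochs=1000, rate=1.0):
    """Perceptron updates until every decoy has higher energy than the native."""
    potentials = [0.0] * len(native_counts)
    for _ in range(max_epochs):
        violated = False
        for decoy in decoy_counts:
            # Difference in contact counts between decoy and native structure.
            diff = [d - n for d, n in zip(decoy, native_counts)]
            # Violation: the decoy's energy is not above the native's energy.
            if energy(potentials, diff) <= 0.0:
                potentials = [u + rate * x for u, x in zip(potentials, diff)]
                violated = True
        if not violated:
            return potentials
    raise RuntimeError("no separating potentials found within max_epochs")

# Hypothetical contact counts for three contact types.
native = [4, 1, 0]
decoys = [[2, 2, 1], [3, 0, 2], [1, 3, 1]]
potentials = learn_potentials(native, decoys)
```

In a full procedure, new decoys would be generated with the current potentials and the learning repeated; that outer loop is omitted here.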
16.
A novel procedure has been developed to specifically label the cytoplasmic domains of transmembrane proteins with the aldehyde pyridoxal 5-phosphate (PLP). Torpedo californica acetylcholine receptor (AcChR) vesicles were loaded with [3H]pyridoxine 5-phosphate ([3H]PNP) and pyridoxine-5-phosphate oxidase, followed by intravesicular enzymatic oxidation of [3H]PNP at 37 degrees C in the presence of externally added cytochrome c as a scavenger of possible leaking PLP product. The resulting Schiff's bases between PLP and AcChR amino groups were reduced with NaCNBH3, and the pyridoxylated proteins were analyzed by fluorography. The four receptor subunits were labeled whether the reaction was carried out on the internal surface or separately designed to mark the external one. On the other hand, the relative pyridoxylation of the subunits differed in both cases, reflecting differences in accessible lysyl residues in each side of the membrane. Proteinase K treatment of labeled AcChR vesicles generated a peptide of 13 kDa that could be detected with anti-PLP antibodies only when the pyridoxylation was carried out on the internal surface of the vesicles. Even though there are no large differences in the total lysine content among the subunits and there are two copies of the alpha-subunit, internal surface labeling by PLP was greatest for the highest molecular weight (delta) subunit, reinforcing the concept that the four receptor subunits are transmembranous and may protrude into the cytoplasmic face in a fashion [Strader, C. D., & Raftery, M. A. (1980) Proc. Natl. Acad. Sci. U.S.A. 77, 5807-5811] that is proportional to their subunit molecular weight. (ABSTRACT TRUNCATED AT 250 WORDS)
17.
The Journal of cell biology, 1993, 121(2):317-333
A COOH-terminal double lysine motif maintains type I transmembrane proteins in the ER. Proteins tagged with this motif, e.g., CD8/E19 and CD4/E19, rapidly received post-translational modifications characteristic of the intermediate compartment and partially colocalized to this organelle. These proteins also received modifications characteristic of the Golgi, but much more slowly. Lectin staining localized these Golgi-modified proteins to the ER, indicating that this motif is a retrieval signal. Differences in the subcellular distribution and rate of post-translational modification of CD8 maintained in the ER by sequences derived from a variety of ER resident proteins suggested that the efficiency of retrieval was dependent on the sequence context of the double lysine motif and that retrieval may be initiated from multiple positions along the exocytotic pathway.
18.
MOTIVATION: The best quality multiple sequence alignments are generally considered to derive from structural superposition. However, no previous work has studied the relative performance of profile hidden Markov models (HMMs) derived from such alignments. Therefore several alignment methods have been used to generate multiple sequence alignments from 348 structurally aligned families in the HOMSTRAD database. The performance of profile HMMs derived from the structural and sequence-based alignments has been assessed for homologue detection. RESULTS: The best alignment methods studied here correctly align nearly 80% of residues with respect to structure alignments. Alignment quality and model sensitivity are found to be dependent on average number, length, and identity of sequences in the alignment. The striking conclusion is that, although structural data may improve the quality of multiple sequence alignments, this does not add to the ability of the derived profile HMMs to find sequence homologues. SUPPLEMENTARY INFORMATION: A list of HOMSTRAD families used in this study and the corresponding Pfam families is available at http://www.sanger.ac.uk/Users/sgj/alignments/map.html Contact: sgj@sanger.ac.uk
19.
20.
INSIGs are proteins that underlie sterol regulation of the mammalian proteins SCAP (SREBP cleavage activating protein) and HMG-CoA reductase (HMGR). The INSIGs perform distinct tasks in the regulation of these effectors: they promote ER retention of SCAP, but ubiquitin-mediated degradation of HMGR. Two questions that arise from the discovery and study of INSIGs are: how do they perform these distinct tasks, and how general are the actions of INSIGs in biology? We now show that the yeast INSIG homologs NSG1 and NSG2 function to control the stability of yeast Hmg2p, the HMGR isozyme that undergoes regulated ubiquitination. Yeast Nsgs inhibit degradation of Hmg2p in a highly specific manner, by directly interacting with the sterol-sensing domain (SSD)-containing transmembrane region. Nsg1p functions naturally to limit degradation of Hmg2p when both proteins are at native levels, indicating a long-standing functional interplay between these two classes of proteins. One way to unify the known, disparate actions of INSIGs is to view them as known adaptations of a chaperone dedicated to SSD-containing client proteins.