Similar literature
1.
Shannon's definition of uncertainty or surprisal has been applied extensively to measure the information content of aligned DNA sequences and to characterize DNA binding sites. In contrast to Shannon's uncertainty, this study investigates the applicability and suitability of a parametric uncertainty measure due to Rényi. It is observed that this measure also provides results in agreement with Shannon's measure, pointing to its utility in analysing DNA binding site regions. To facilitate the comparison between these uncertainty measures, a dimensionless quantity called "redundancy" has been employed. It is found that Rényi's measure at low parameter values delineates binding sites (binding regions) better than Shannon's measure. The critical value of the parameter is chosen with an outlier criterion.
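
As a minimal sketch of the comparison described in this abstract, the snippet below computes Shannon and Rényi uncertainty for one column of an alignment. The standard Rényi formula is used; the redundancy definition (one minus entropy divided by its maximum), the function names, and the example column are assumptions for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def base_frequencies(column):
    """Relative frequencies of the bases observed in one alignment column."""
    counts = Counter(column)
    total = sum(counts.values())
    return [n / total for n in counts.values()]


def shannon_entropy(probs):
    """Shannon uncertainty in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def renyi_entropy(probs, alpha):
    """Renyi uncertainty of order alpha in bits; alpha = 1 is the Shannon limit."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if math.isclose(alpha, 1.0):
        return shannon_entropy(probs)
    return math.log2(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)


def redundancy(entropy_bits, alphabet_size=4):
    """Assumed definition: 1 - H / H_max, with H_max = log2(alphabet size)."""
    return 1.0 - entropy_bits / math.log2(alphabet_size)


# Hypothetical alignment column: one position across eight aligned sites.
column = "AAAAAAGT"
probs = base_frequencies(column)
for alpha in (0.5, 1.0, 2.0):
    h = renyi_entropy(probs, alpha)
    print(f"alpha={alpha}: H={h:.3f} bits, redundancy={redundancy(h):.3f}")
```

Sweeping `alpha` over a range of low values for every column of an alignment would be one way to look for the parameter region the abstract refers to.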

2.
The relationship between information and energy is key to understanding biological systems. We can display the information in DNA sequences specifically bound by proteins by using sequence logos, and we can measure the corresponding binding energy. These can be compared by noting that one of the forms of the second law of thermodynamics defines the minimum energy dissipation required to gain one bit of information. Under the isothermal conditions in which molecular machines function, this is $k_B T \ln 2$ joules per bit ($k_B$ is Boltzmann's constant and $T$ is the absolute temperature). Then an efficiency of binding can be computed by dividing the information in a logo by the free energy of binding after it has been converted to bits. The isothermal efficiencies of not only genetic control systems, but also visual pigments are near 70%. From information and coding theory, the theoretical efficiency limit for bistate molecular machines is ln 2 = 0.6931. Evolutionary convergence to maximum efficiency is limited by the constraint that molecular states must be distinct from each other. The result indicates that natural molecular machines operate close to their information processing maximum (the channel capacity), and implies that nanotechnology can attain this goal.
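
The efficiency calculation described in this abstract can be sketched as follows, assuming the minimum dissipation is $k_B T \ln 2$ joules per bit. The function names and the example numbers (10 bits, -25 kJ/mol, 298.15 K) are hypothetical and are not values from the paper.

```python
import math

BOLTZMANN = 1.380649e-23   # J/K (exact SI value)
AVOGADRO = 6.02214076e23   # 1/mol (exact SI value)


def min_energy_per_bit(temperature_k):
    """Minimum energy dissipation per bit, k_B * T * ln 2, in joules."""
    return BOLTZMANN * temperature_k * math.log(2)


def energy_to_bits(delta_g_kj_per_mol, temperature_k):
    """Convert a binding free energy (kJ/mol) into the bits it could pay for."""
    joules_per_molecule = abs(delta_g_kj_per_mol) * 1000.0 / AVOGADRO
    return joules_per_molecule / min_energy_per_bit(temperature_k)


def binding_efficiency(information_bits, delta_g_kj_per_mol, temperature_k):
    """Information gained divided by the information the energy could buy."""
    return information_bits / energy_to_bits(delta_g_kj_per_mol, temperature_k)


# Hypothetical inputs, chosen only to illustrate the arithmetic.
eff = binding_efficiency(information_bits=10.0,
                         delta_g_kj_per_mol=-25.0,
                         temperature_k=298.15)
print(f"efficiency = {eff:.2f}")
```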

3.
8-Anilino-1-naphthalene sulfonate (ANS) and its covalent dimer bis-ANS are widely used for titrating hydrophobic surfaces of proteins. The nature of the interaction of these dyes with proteins has been seriously pursued. However, because the techniques used in these studies varied, they often provided differing information regarding stoichiometry, binding affinity, actual binding sites, etc. In the present study, we used a combination of computational methods (docking and MD simulation) and experimental methods (mutations, steady-state and time-resolved fluorescence) to investigate the interaction of bis-ANS with Bacillus subtilis lipase. We identified seven binding sites for bis-ANS on lipase using computational docking and MD simulation and verified these data using a set of single amino acid substituted mutants. Docking and MD simulation studies indicated that the binding sites were various indentations and grooves on the protein surface with hydrophobic characteristics. Both hydrophobic and ionic interactions were involved in each of these binding events. We further examined the fluorescence properties of bis-ANS bound to mutant lipases that either gained or lost a binding site. Our results indicated that neither gain nor loss of a single binding site caused any change in the fluorescence lifetimes (and their relative amplitudes) of mutant lipase-bound bis-ANS in comparison to that bound to wild type; hence, the nature of bis-ANS binding to each of the sites in lipase appears to be very similar.

4.
Summary: A genetic algorithm (GA) based method for docking ensembles of small, flexible ligands to receptor proteins using NMR-derived constraints is described. In this method, three translations and rotations of the ligand and the dihedral angles of the ligand are represented by binary strings and evolve under the genetic operators of cross-over, mutation, migration and selection. The fitness function for the selection process includes distance and dihedral restraints and a repulsive van der Waals term. The GA was applied to a three-atom model system as well as to the streptavidin-biotin complex using simulated intermolecular distance restraints. In both systems, the GA was able to obtain low-energy conformations when only a single binding site was simulated. Calculations were also performed using distance restraints from two distinct binding sites. In these simulations, the GA was able to obtain low-energy conformations corresponding to ligand molecules in each of the two sites. The inclusion of additional ligands in the ensemble did not result in an energetic benefit, confirming that only two ligand conformations were necessary to fulfill the distance restraints. This method allows for a direct investigation of the minimum number of ligand orientations necessary to fulfill experimental distance restraints, and simultaneously yields detailed structural information about each site.

5.
Statistical methodology for the identification and characterization of protein binding sites in a set of unaligned DNA fragments is presented. Each sequence must contain at least one common site. No alignment of the sites is required. Instead, the uncertainty in the location of the sites is handled by employing the missing information principle to develop an "expectation maximization" (EM) algorithm. This approach allows for the simultaneous identification of the sites and characterization of the binding motifs. The reliability of the algorithm increases with the number of fragments, but the computations increase only linearly. The method is illustrated with an example, using known cyclic adenosine monophosphate receptor protein (CRP) binding sites. The final motif is utilized in a search for undiscovered CRP binding sites.

6.
In 1972, Haseman and Elston proposed a pioneering regression method for mapping quantitative trait loci using randomly selected sib pairs. Recently, the statistical power of their method was shown to be increased when extremely discordant sib pairs are ascertained. While the precise genetic model may not be known, prior information that constrains IBD probabilities is often available. We investigate properties of tests that are robust against model uncertainty and show that the power gain from further constraining IBD probabilities is marginal. The additional linkage information contained in the trait values can be incorporated by combining the Haseman-Elston regression method and a robust allele sharing test.
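
For orientation, the classical Haseman-Elston regression mentioned in this abstract regresses the squared trait difference of each sib pair on the estimated proportion of alleles shared identical by descent (IBD); a negative slope suggests linkage. The sketch below is a plain least-squares version with hypothetical data and a hypothetical function name. It does not implement the robust or combined tests the abstract proposes.

```python
def haseman_elston_regression(trait_pairs, ibd_proportions):
    """Least-squares fit of squared sib-pair trait differences on IBD proportion.

    Returns (slope, intercept); a clearly negative slope suggests linkage.
    """
    y = [(a - b) ** 2 for a, b in trait_pairs]
    x = list(ibd_proportions)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sib pairs: (trait of sib 1, trait of sib 2) and IBD proportion.
pairs = [(2.0, 0.0), (0.0, 2.0), (1.5, 0.5), (1.0, 0.0), (0.0, 1.0)]
ibd = [0.0, 0.0, 0.5, 1.0, 1.0]
slope, intercept = haseman_elston_regression(pairs, ibd)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```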

7.
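刘陈坚 (Liu Chenjian)  张黎明 (Zhang Liming)  任引 (Ren Yin) 《生态学报》(Acta Ecologica Sinica) 2020,40(22):8199-8206
Forest biomass directly affects the assessment of forest ecosystem services. How to use landsenses ecology to accurately predict the spatiotemporal evolution of forest biomass at the regional scale is a key strategic question that bears on the formulation of major national policies and the construction of an ecological industry system. The aim of this study is to build an ecological information diagnosis framework and to optimize the structure of the meliorization model (the 3PG2 model), in order to resolve the uncertainty in ecological prediction that model structure design introduces during forest landsense creation. Taking Nanjing County, Fujian, where Chinese fir forests are widely distributed, as the study area, we selected an appropriate threshold range and used spatial statistical analysis to identify the regions of uncertainty in simulated biomass, and built an ecological information diagnosis framework consisting of three parts: the Geogdetector software, a genetic technique, and a computer program. Geogdetector was used to clarify the influence and mechanism of multi-factor interactions on model simulation, the genetic technique was used to optimize the model structure and improve simulation accuracy, and the computer program together with the 3PG2 model was used to accurately predict the spatiotemporal evolution of Chinese fir forest biomass at the regional scale. The results show that stand age is the dominant factor behind the uncertainty of the biomass simulated by the 3PG2 model. The ecological information diagnosis framework built through landsenses ecology (mix-marching data and the meliorization model) can accurately predict forest biomass and enable sustainable forest management at the regional scale.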

8.
A major issue faced by breeders is how to effectively manage adverse correlations in breeding programs. We present results of a Monte Carlo allele-based simulation of the changes in response and variance of response under adverse genetic correlations by using the examples of two contrasting selection methods: the ‘Smith-Hazel’ selection index (SH) and independent culling (IC). We assumed several gene models, which included linkage and antagonistic pleiotropy as the primary drivers of adverse genetic correlations. The different behaviors of these selection methods allowed us to identify the mechanism behind the generation of uncertainty under antagonistic trait selection: IC had the properties of stabilizing selection, while SH behaved more like disruptive selection. Although SH outperformed IC in terms of genetic gain, this advantage came at the cost of higher variance of response and loss of heterozygosity. Using an optimum selection algorithm (OS) to prevent the loss of heterozygosity through a constraint on inbreeding in SH/OS marginally increased the reliability, which still remained below that of IC under equal conditions. However, SH/OS had lower inbreeding (ΔF) than IC for equivalent levels of genetic gain, so a breeder must make a compromise between high selection reliability, low ΔF, and gain under antagonistic trait selection even with the use of optimization tools.

9.
The statistics of base-pair usage within known recognition sites for a particular DNA-binding protein can be used to estimate the relative protein binding affinities to these sites, as well as to sites containing any other combinations of base-pairs. As has been described elsewhere, the connection between base-pair statistics and binding free energy is made by an equal probability selection assumption; i.e. that all base-pair sequences that provide appropriate binding strength are equally likely to have been chosen as recognition sites in the course of evolution. This is analogous to a statistical-mechanical system where all configurations with the same energy are equally likely to occur. In this communication, we apply the statistical-mechanical selection theory to analyze the base-pair statistics of the known recognition sequences for the cyclic AMP receptor protein (CRP). The theoretical predictions are found to be in reasonable agreement with binding data for those sequences for which experimental binding information is available, thus lending support to the basic assumptions of the selection theory. On the basis of this agreement, we can predict the affinity for CRP binding to any base-pair sequence, albeit with a large statistical uncertainty. When the known recognition sites for CRP are ranked according to predicted binding affinities, we find that the ranking is consistent with the hypothesis that the level of function of these sites parallels their fractional saturation with CRP-cAMP under in-vivo conditions. When applied to the entire genome, the theory predicts the existence of a large number of randomly occurring "pseudosites" with strong binding affinity for CRP. It appears that most CRP molecules are engaged in non-productive binding at non-specific or pseudospecific sites under in-vivo conditions. In this sense, the specificity of the CRP binding site is very low. 
Relative specificity requirements for polymerases, repressors and activators are compared in light of the results of this and the first paper in this series.

10.
11.
Many eukaryotic secretory proteins are selected for export from the endoplasmic reticulum (ER) through their interaction with the Sec24p subunit of the coat protein II (COPII) coat. Three distinct cargo‐binding sites on yeast Sec24p have been described by biochemical, genetic and structural studies. Each site recognizes a limited set of peptide motifs or a folded structural domain; however, the breadth of cargo recognized by a given site and the dynamics of cargo engagement remain poorly understood. We aimed to gain further insight into the broader molecular function of one of these cargo‐binding sites using a non‐biased genetic approach. We exploited the in vivo lethality associated with mutation of the Sec24p B‐site to identify genes that suppress this phenotype when overexpressed. We identified SMY2 as a general suppressor that rescued multiple defects in Sec24p, and SEC22 as a specific suppressor of two adjacent cargo‐binding sites, raising the possibility of allosteric regulation of these domains. We generated a novel set of mutations in Sec24p that distinguish these two sites and examined the ability of Sec22p to rescue these mutations. Our findings suggest that co‐operativity does not influence cargo capture at these sites, and that Sec22p rescue occurs via its function as a retrograde SNARE.

12.
Information analysis of Fis binding sites. (Total citations: 15; self-citations: 6; citations by others: 9)
Originally discovered in the bacteriophage Mu DNA inversion system gin, Fis (Factor for Inversion Stimulation) regulates many genetic systems. To determine the base frequency conservation required for Fis to locate its binding sites, we collected a set of 60 experimentally defined wild-type Fis DNA binding sequences. The sequence logo for Fis binding sites showed the significance and likely kinds of base contacts, and these are consistent with available experimental data. Scanning with an information theory based weight matrix within fis, nrd, tgt/sec and gin revealed Fis sites not previously identified, but for which there are published footprinting and biochemical data. DNA mobility shift experiments showed that a site predicted to be 11 bases from the proximal Salmonella typhimurium hin site and a site predicted to be 7 bases from the proximal P1 cin site are bound by Fis in vitro. Two predicted sites separated by 11 bp found within the nrd promoter region, and one in the tgt/sec promoter, were also confirmed by gel shift analysis. A sequence in aldB previously reported to be a Fis site, for which information theory predicts no site, did not shift. These results demonstrate that information analysis is useful for predicting Fis DNA binding.
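
Scanning with an information-theory weight matrix can be sketched roughly as below. The weight used here, 2 + log2 f(b, l) bits for base b at position l, is the common individual-information form without a small-sample correction; the pseudocount, the function names, and the toy sites are assumptions for illustration and are not the Fis data set or the authors' software.

```python
import math

BASES = "ACGT"


def information_weight_matrix(sites, pseudocount=0.5):
    """Per-position weights 2 + log2(frequency), in bits, from aligned sites.

    The pseudocount (an assumption) keeps unseen bases from giving log2(0).
    """
    width = len(sites[0])
    matrix = []
    for pos in range(width):
        column = [site[pos] for site in sites]
        total = len(column) + pseudocount * len(BASES)
        matrix.append({
            base: 2.0 + math.log2((column.count(base) + pseudocount) / total)
            for base in BASES
        })
    return matrix


def scan(sequence, matrix):
    """Score every window of the sequence; returns (start, bits) pairs."""
    width = len(matrix)
    return [
        (start, sum(matrix[i][sequence[start + i]] for i in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


# Hypothetical aligned sites and target sequence (uppercase A, C, G, T only).
toy_sites = ["ACGT", "ACGT", "ACGA", "ACGT"]
hits = scan("TTACGTTT", information_weight_matrix(toy_sites))
best_start, best_bits = max(hits, key=lambda hit: hit[1])
print(best_start, round(best_bits, 2))
```

Windows that score well above zero bits would be the candidate sites to check experimentally, as the abstract describes for gel shift assays.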

13.
We reasoned that mating animals by minimising the covariance between ancestral contributions (MCAC mating) will generate less inbreeding and at least as much genetic gain as minimum-coancestry mating in breeding schemes where the animals are truncation-selected. We tested this hypothesis by stochastic simulation and compared the mating criteria in hierarchical and factorial breeding schemes, where the animals were selected based on breeding values predicted by animal-model BLUP. Random mating was included as a reference-mating criterion. We found that MCAC mating generated 4% to 8% less inbreeding than minimum-coancestry mating in the hierarchical and factorial breeding schemes without any loss in genetic gain. Moreover, it generated up to 28% less inbreeding and about 3% more genetic gain than random mating. The benefits of MCAC mating over minimum-coancestry mating are worthwhile because they can be achieved without extra costs or practical constraints. MCAC mating merely uses pedigree information to pair the animals more appropriately and is clearly a worthy alternative to minimum-coancestry mating and probably any other mating criterion. We believe, therefore, that MCAC mating should be used in breeding schemes where pedigree information is available.

14.
Variances for general combining ability (GCA) and specific combining ability (SCA) and the relationship between mid-parental GCA and SCA effects were estimated for tree diameter (DBH) from a series of 20 sets of 6×6 half-diallel mating experiments in radiata pine, planted at ten sites across Australia. Significant SCA variance for DBH was almost equal to GCA variance for the combined analysis of all ten sites. The importance of SCA variance varied among sites, from non-significant to SCA variance accounting for all genetic variation among full-sib families. Significant SCA × site interaction was detected among the ten sites. A significant and positive correlation between mid-parental breeding values and best linear unbiased predictions of the SCA effects was observed. About a quarter of extra genetic gain is achievable through use of SCA variance if selection is based on the best breeding values. To fully exploit genetic gain from SCA variance in a deployment population, positive assortative matings are required for the best parents. It is estimated that an additional deployment gain of 46.0% for ten sites combined, or 52.9% for four sites combined that had significant GCA as well as SCA effects, was achievable relative to gain from GCA only, if all SCA variance within this breeding population was exploited. For a breeding population, selection for breeding values may be sufficient due to positive correlations between breeding values and SCA values. For a deployment population to capture more SCA genetic gain, it is preferable to make more pair-wise matings for parents with higher breeding values. Communicated by O. Savolainen

15.
The Sleeping Beauty (SB) transposon is the most widely used DNA transposon in genetic applications and is the only DNA transposon thus far in clinical trials for human gene therapy. In the absence of atomic level structural information, the development of the SB transposon relied primarily on biochemical and genetic homology data. While these studies were successful and have yielded hyperactive transposases, structural information is needed to gain a mechanistic understanding of transposase activity and to guide further improvement. We have initiated a structural study of SB transposase using Nuclear Magnetic Resonance (NMR) and Circular Dichroism (CD) spectroscopy to investigate the properties of the DNA‐binding domain of SB transposase in solution. We show that at physiologic salt concentrations, the SB DNA‐binding domain remains mostly unstructured but its N‐terminal PAI subdomain forms a compact, three‐helical structure with a helix‐turn‐helix motif at higher concentrations of NaCl. Furthermore, we show that the full‐length SB DNA‐binding domain associates differently with inner and outer binding sites of the transposon DNA. We also show that the PAI subdomain of the SB DNA‐binding domain has a dominant role in the transposase's attachment to the inverted terminal repeats of the transposon DNA. Overall, our data validate several earlier predictions and provide new insights on how SB transposase recognizes transposon DNA.

16.
17.
M L Doyle  J H Simmons  S J Gill 《Biopolymers》1990,29(8-9):1129-1135
Examination of binding information in the form of derivative (or finite difference) measurements is explored (1) experimentally by a thin-layer optical procedure (Dolman, D. & Gill, S. J. (1978) Anal. Biochem. 87, 127-134) and (2) theoretically by simulation in order to determine the influence of the number of data points and their standard error upon the resolvability of binding parameters in cooperative and non-cooperative systems. The data are described by the difference in optical absorbance divided by the change in the logarithm of the ligand activity, and each data point is assumed to be influenced by a random error with a given variance. It is found that increasing the number of data points, which in turn effectively reduces the magnitude of the observed absorbance changes, results in an increase in the uncertainty of the resolved parameters of the system. The effect is verified by both experimental and simulation studies. Thus one is led to suggest that fewer measurements with larger absorbance changes produce the most favorable situation for parameter resolution when the data are in the form of finite difference measurements.

18.
J F Brandts  L N Lin 《Biochemistry》1990,29(29):6927-6940
Data from differential scanning calorimetry (DSC) may be used to estimate very large binding constants that cannot be conveniently measured by more conventional equilibrium techniques. Thermodynamic models have been formulated to describe interacting systems that involve either one thermal transition (protein-ligand) or two thermal transitions (protein-protein) and either 1:1 or higher binding stoichiometry. Methods are described for obtaining binding constants and heats of binding by two different methods: calculation or simulation fitting of data. Extensive DSC data on 2'CMP binding to RNase are presented and analyzed by the two methods. It is found that the methods agree when binding sites are completely saturated, but substantial errors arise in the calculation method when site saturation is incomplete and the transition of liganded molecules overlaps that of unliganded molecules. This arises primarily from an inability to determine $T_M$ (i.e., the temperature where concentrations of folded and unfolded protein are equal) under weak-binding conditions. Results from simulation show that the binding constants and heats of binding from the DSC method agree quantitatively with corresponding estimates obtained from equilibrium methods when extrapolated to the same temperature. It was also found from the DSC data that the binding constant decreases with increasing concentration of ligand, which might arise from nonideality effects associated with dimerization of 2'CMP. Simulations show that the DSC method is capable of estimating binding constants for ultratight interactions up to perhaps $10^{40}$ M$^{-1}$ or higher, while most equilibrium methods fail well below $10^{10}$ M$^{-1}$.
DSC data from the literature on a number of interacting systems (trypsin-soybean trypsin inhibitor, trypsin-ovomucoid, trypsin-pancreatic trypsin inhibitor, chymotrypsin-subtilisin inhibitor, subtilisin BPN-subtilisin inhibitor, RNase S protein-RNase S peptide, avidin-biotin, ovotransferrin-Fe$^{3+}$, superoxide dismutase-Zn$^{2+}$, alkaline phosphatase-Zn$^{2+}$, and assembly of regulatory and catalytic subunits of aspartate transcarbamoylase) were analyzed by simulation fitting or by calculation. Apparent single-site binding constants ranged from ca. $10^{5}$ to $10^{20}$ M$^{-1}$, while the interaction constant for assembly of aspartate transcarbamoylase was estimated as $10^{37}$ in molarity units. For most of these systems, the DSC interaction constants compared favorably with other literature estimates; for some they did not, for reasons unknown; and for still others this represented the first estimate. Simulations show that for proteins having two binding sites for the same ligand within a single cooperative unit, ligand rearrangement will occur spontaneously during a DSC scan as the transition temperature of the unliganded protein is approached. (ABSTRACT TRUNCATED AT 400 WORDS)

19.
Material Flow Analysis (MFA) is a useful method for modeling, understanding, and optimizing sociometabolic systems. Among others, MFAs can be distinguished by two general system properties: First, they differ in their complexity, which depends on system structure and size. Second, they differ in their inherent uncertainty, which arises from limited data quality. In this article, uncertainty and complexity in MFA are approached from a systems perspective and expressed as formally linked phenomena. MFAs are, in a graph‐theoretical sense, understood as networks. The uncertainty and complexity of these networks are computed by use of information measures from the field of theoretical ecology. The size of a system is formalized as a function of its number of flows. It defines the potential information content of an MFA system and holds as a reference against which complexity and uncertainty are gauged. Integrating data quality measures, the uncertainty of an MFA before and after balancing is determined. The actual information content of an MFA is measured by relating its uncertainty to its potential information content. The complexity of a system is expressed based on the configuration of each individual flow in relation to its neighboring flows. The proposed metrics enable different material flow systems to be compared to one another and the role of individual flows within a system to be assessed. They provide information useful for the design of MFAs and for the communication of MFA results. For exemplification, the regional MFAs of aluminum and plastics in Austria are analyzed in this article.

20.
Chan JY 《Bio Systems》2012,108(1-3):28-33
Recent evidence supports the existence of a mutator phenotype in cancer cells, although the mechanistic basis remains unknown. In this paper, it is shown that this enhanced genetic instability is generated by an amplified measurement uncertainty on genetic information during DNA replication. At baseline, an inherent measurement uncertainty implies an imprecision in the recognition, replication and transfer of genetic information, and forms the basis for an intrinsic genetic instability in all biological cells. Genetic information is contained in the sequence of DNA bases, each existing, due to proton tunnelling, as a coherent superposition of quantum states composed of both the canonical and rare tautomeric forms until decoherence by interaction with DNA polymerase. The result of such a quantum measurement process may be interpreted classically as akin to a Bernoulli trial, whose outcome $X$ is random and can be either of two possibilities, depending on whether the proton is tunnelled ($X=1$) or not ($X=0$). This inherent quantum uncertainty is represented by a binary entropy function and quantified in terms of Shannon information entropy $H(X) = -P(X=1)\log_2 P(X=1) - P(X=0)\log_2 P(X=0)$. Enhanced genetic instability may either be directly derived from amplified uncertainty induced by increases in quantum and thermodynamic fluctuation, or indirectly arise from the loss of natural uncertainty reduction mechanisms.
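
The binary entropy function named in this abstract is simple to evaluate. The sketch below treats the tunnelling probability as a plain input; the probabilities in the loop are hypothetical and are not values reported in the paper.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a Bernoulli trial with P(X=1) = p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


# Hypothetical tunnelling probabilities: uncertainty grows as p approaches 0.5.
for p in (1e-4, 1e-2, 0.5):
    print(f"P(X=1)={p}: H(X)={binary_entropy(p):.4f} bits")
```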
