Similar literature
1.
Hu Jian  Hu Jinjiao  Lü Yihe 《生态学报》 (Acta Ecologica Sinica) 2021,41(16):6417-6429
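Regional vegetation restoration has changed land-use types and thereby effectively controlled soil and water loss, but the spatial differentiation of the relationship between land use and soil and water loss remains unclear. We compiled 59 publications on runoff-plot experiments and observations on Loess Plateau hillslopes, comprising 1121 annual runoff and sediment-yield records. Using eight major critical zone types as the basis for spatial stratification, we applied the geographical detector to analyze the spatial differentiation of the relationship between land use and annual runoff and sediment yield. The results showed that abandoned land had the highest mean annual runoff and sediment yield, 35.99 mm and 4208.82 g/m², respectively; the runoff and sediment-yield capacity of abandoned land, bare land, and cropland was significantly higher than that of artificial grassland, forest land, natural grassland, and shrubland, and the mean annual sediment yield of shrubland and forest land was significantly lower than that of artificial and natural grassland (P<0.05). Apart from the mean annual sediment yield of abandoned land, which was highest in the mountain forest critical zone (16240.40 g/m²), the mean annual runoff and sediment yield of abandoned land was significantly higher in the hilly-gully agriculture-forest-grassland interlaced critical zone than in the hilly agriculture-grassland critical zone; in these two critical zones, bare land and cropland had relatively high runoff and sediment-yield capacity, and the mean annual sediment yield of artificial grassland and shrubland was significantly higher than in the other critical zone types (P<0.05). In the mountain forest critical zone, forest land had the lowest mean annual runoff, runoff coefficient, and sediment yield, at 1.56 mm, 0.41%, and 307.36 g/m², respectively, whereas natural grassland had relatively high mean annual runoff and relatively low mean annual sediment yield in every critical zone type. Local characteristics of the runoff plots (such as land use, area, slope gradient, and slope length) were the primary factors driving the differentiation of annual runoff and sediment yield among critical zones, and the factors interacted with one another with nonlinear enhancement. These results indicate that vegetation restoration can effectively conserve soil and water, but that an appropriate vegetation type must be chosen for regional restoration; in the loess hilly-gully region, natural grassland, shrubland, and forest land should be preferred. The study can provide a scientific basis for optimizing the configuration of regional vegetation restoration on the Loess Plateau.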

2.
Proteins are generally classified into four structural classes: all-alpha proteins, all-beta proteins, alpha + beta proteins, and alpha/beta proteins. In this article, a protein is expressed as a vector in a 20-dimensional space, in which its 20 components are defined by the composition of its 20 amino acids. Based on this, a new method, the so-called maximum component coefficient method, is proposed for predicting the structural class of a protein according to its amino acid composition. In comparison with the existing methods, the new method yields a higher general accuracy of prediction. Especially for the all-alpha proteins, the rate of correct prediction obtained by the new method is much higher than that by any of the existing methods. For instance, for the 19 all-alpha proteins investigated previously by P.Y. Chou, the rate of correct prediction by means of his method was 84.2%, but the correct rate when predicted with the new method would be 100%! Furthermore, the new method is characterized by an explicable physical picture. This is reflected by the process in which the vector representing a protein to be predicted is decomposed into four component vectors, each of which corresponds to one of the norms of the four protein structural classes.
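
The decomposition step described in this abstract can be illustrated with a minimal sketch. It assumes that each class norm is a mean composition vector and that the coefficients come from an unconstrained least-squares fit; the abstract states neither, so this is not necessarily the authors' procedure. The function names, the 5-component toy vectors (standing in for 20 amino acids) and the query are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]  # partial pivoting
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def component_coefficients(x, norms):
    """Least-squares coefficients c with x ~ sum(c[k] * norms[k])."""
    dot = lambda u, v: sum(p * q for p, q in zip(u, v))
    gram = [[dot(nj, nk) for nk in norms] for nj in norms]  # normal equations
    rhs = [dot(nj, x) for nj in norms]
    return solve(gram, rhs)


def predict_class(x, norms, labels):
    """Assign the class whose component coefficient is largest."""
    c = component_coefficients(x, norms)
    return labels[max(range(len(c)), key=c.__getitem__)]


# Hypothetical toy norms: one composition vector per structural class.
LABELS = ["all-alpha", "all-beta", "alpha+beta", "alpha/beta"]
NORMS = [
    [0.4, 0.2, 0.2, 0.1, 0.1],
    [0.1, 0.4, 0.2, 0.2, 0.1],
    [0.1, 0.1, 0.4, 0.2, 0.2],
    [0.2, 0.1, 0.1, 0.4, 0.2],
]
query = [0.32, 0.20, 0.21, 0.15, 0.12]  # hypothetical query composition
print(predict_class(query, NORMS, LABELS))
```

The toy query was built as a mixture dominated by the first norm, so the largest coefficient points at that class.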

3.
Microorganisms can initiate the degradation of organic compounds by oxygenation reactions that require the investment of energy and electrons. This diversion of energy and electrons away from synthesis reactions leads to decreased overall cell yields. A thermodynamic method was developed that improves the accuracy of cell yield prediction for compounds degraded through pathways involving oxygenation reactions. This method predicts yields and stoichiometry for each step in the biodegradation pathway, thus enabling the modeling of a multi-step biodegradation process in which oxygenations occur and intermediates may persist. EDTA and benzene biodegradation are presented as examples. The method compares favorably with other yield prediction methods while providing additional information on yields for intermediates produced in the degradation pathway.

4.
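Building a grassland yield growth model with the new PPR method and studying its ecological causal mechanisms   Total citations: 2 (self-citations: 0, citations by others: 2)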
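Using the novel projection pursuit regression (PPR) statistical method, we performed multiple regression analysis and PPR analysis on 3 years of grassland yield observations and six related ecological factors from Artemisia desert grassland on the northern slope of the Tianshan Mountains in Xinjiang, built a new PPR forage growth model, and then analyzed and theoretically interpreted the ecological causal mechanisms of yield. The results show that, in ordinary years, hay yield varies regularly across years and seasons roughly in line with the distribution of precipitation, and the yield distribution follows a bimodal pattern. Of the three ecological factors that affect yield formation and level (precipitation, accumulated temperature, and sunshine hours), precipitation has the most pronounced effect and is the dominant factor; forage growth and development status and growth type, however, result from the combined action of water and heat factors, soil texture, and the ecological habits and species composition of the forage. In building the grassland yield growth model and analyzing the causal mechanisms, the PPR method overcame the "curse of dimensionality" caused by the sparsity of non-normal, nonlinear, high-dimensional data points and automatically excluded interference from irrelevant factors and human factors, which improved the accuracy and efficiency of the modeling. It also gave the relative contribution of each independent variable to the model and revealed intrinsic structural features and regularities in the data that are difficult to detect with other multiple regression methods. The resulting PPR model estimated and fitted the measured yield values well, with a yield-estimation accuracy above 90%; it passed the performance test and provided a statistical analysis of contribution rates. Among the six independent variables, growth period (T) contributed most to forage yield increment, followed by soil water content (SW), precipitation (PR), air temperature (AT), and soil temperature (ST), with grassland type (GT) contributing least. The multi-year results provide a solid scientific basis and a new modeling method for the rational use of the Tianshan grasslands and for a deeper understanding of how grassland ecosystems form.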

5.
The accuracy and precision of the National Research Council (NRC), Gesellschaft für Ernährungsphysiologie (GfE) and Institut National de la Recherche Agronomique (INRA) systems for predicting the digestible energy (DE) value of hays were determined from the results of 15 digestibility trials with natural grassland hays and 9 digestibility trials with lucerne hays that all met strict experimental requirements and followed a tight corpus of methods. The hays were harvested in the temperate zone. They covered broad ranges of chemical composition and DE value. The INRA system was more accurate than the other two systems, with the bias between the predicted and measured DE values of natural grassland and lucerne hays averaging −0.11 and −0.04 MJ/kg DM with the INRA system, 0.34 and −0.70 MJ/kg DM with the NRC system and −0.50 and −1.69 MJ/kg DM with the GfE system (P < 0.05). However, the precision of the three systems was similar; the standard error of prediction corrected by bias was not significantly different (P > 0.05). The GfE system underestimated the DE value of hays, especially of lucerne hays. The differences between the predicted and measured DE values resulted mainly from the errors in the prediction of organic matter digestibility and energy digestibility for both natural grassland and lucerne hays. Discrimination according to botanical family (grassland v. lucerne) can help improve the prediction of the DE value of hays. The choice of appropriate predictive variables is discussed in the light of differences in chemical composition and digestibility of the various cell wall components of grassland and lucerne hays. Neutral detergent fiber (NDF) may thus be preferable to acid detergent fiber (ADF) in the prediction equation of the DE value of lucerne hays, whereas ADF and NDF may both be relevant for natural grassland hays.
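
The two quantities this abstract compares, bias (accuracy) and standard error of prediction corrected by bias (precision), can be computed as in the sketch below. It assumes bias is the mean of predicted minus measured values, which matches the negative sign reported for an underestimating system, and it assumes an n − 1 denominator for the corrected standard error; the abstract does not give the formulas. The function name and the numbers are hypothetical.

```python
import math


def bias_and_sepc(predicted, measured):
    """Return (bias, standard error of prediction corrected by bias)."""
    diffs = [p - m for p, m in zip(predicted, measured)]
    n = len(diffs)
    bias = sum(diffs) / n  # negative when the system underestimates
    # Spread of the errors around the bias (n - 1 denominator assumed).
    sepc = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, sepc


# Hypothetical DE values in MJ/kg DM, for illustration only.
predicted = [10.0, 9.0, 11.0]
measured = [9.0, 9.0, 12.0]
print(bias_and_sepc(predicted, measured))
```

Separating the two makes the abstract's point concrete: a system can be biased yet as precise as an unbiased one, because the corrected standard error ignores a constant offset.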

6.
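Predicting forest yield using neural network and multiple regression techniques   Total citations: 16 (self-citations: 0, citations by others: 16)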
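Traditional statistical techniques are often limited by small samples and by measurement data that do not follow a particular distribution. This paper evaluates a feed-forward neural network algorithm for predicting the yield of deciduous broad-leaved forest. It also introduces a data transformation method that converts qualitative variables into quantitative ones, so that a multiple regression prediction model can be built from a relatively small sample. The data transformation helps improve the predictive performance of the multiple regression model. Under the conditions of this experiment, the results show that the neural network technique produced the best predictions.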

7.
MIXED MODEL APPROACHES FOR ESTIMATING GENETIC VARIANCES AND COVARIANCES   Total citations: 62 (self-citations: 4, citations by others: 58)
The limitations of analysis of variance (ANOVA) methods in estimating genetic variances are discussed. Among the three methods for mixed linear models (maximum likelihood, ML; restricted maximum likelihood, REML; and minimum norm quadratic unbiased estimation, MINQUE), the MINQUE method is presented with formulae for estimating variance components and covariance components and for predicting genetic effects. Several genetic models that cannot be appropriately analyzed by ANOVA methods are introduced in the form of mixed linear models. Genetic models with independent random effects can be analyzed by the MINQUE(1) method, which is a MINQUE method with all prior values set to 1. The MINQUE(1) method can give unbiased estimation for variance components and covariance components, and linear unbiased prediction (LUP) for genetic effects. There are more complicated genetic models for plant seeds which involve correlated random effects. The MINQUE(0/1) method, which is a MINQUE method with all prior covariances set to 0 and all prior variances set to 1, is suitable for estimating variance and covariance components in these models. Mixed model approaches have an advantage over ANOVA methods in their capacity to analyze unbalanced data and complicated models. Some problems concerning estimation and hypothesis testing by the MINQUE method are discussed.

8.
BACKGROUND: Yield capacity is a target trait for selection of agronomically desirable lines; it is preferred to simple yields recorded over different harvests. Yield capacity is derived using certain architectural parameters used to measure the components of yield capacity. METHODS: Observation protocols for describing architecture and yield capacity were applied to six clones of coffee trees (Coffea canephora) in a comparative trial. The observations were used to establish architectural databases, which were explored using AMAPmod, software dedicated to the analysis of plant architecture data. The traits extracted from the database were used to identify architectural parameters for predicting the yield of the plant material studied. CONCLUSIONS: Architectural traits are highly heritable and some display strong genetic correlations with cumulated yield. In particular, the proportion of fruiting nodes at plagiotropic level 15 counting from the top of the tree proved to be a good predictor of yield over two fruiting cycles.

9.
Wang ZX  Yuan Z 《Proteins》2000,38(2):165-175
Proteins of known structures are usually classified into four structural classes: all-alpha, all-beta, alpha+beta, and alpha/beta type of proteins. A number of methods for predicting the structural class of a protein based on its amino acid composition have been developed during the past few years. Recently, a component-coupled method was developed for predicting protein structural class according to amino acid composition. This method is based on the least Mahalanobis distance principle, and yields much better predicted results in comparison with the previous methods. However, the success rates reported for structural class prediction by different investigators are contradictory. The highest reported accuracies by this method are near 100%, but the lowest one is only about 60%. The goal of this study is to resolve this paradox and to determine the possible upper limit of prediction rate for structural classes. In this paper, based on the normality assumption and the Bayes decision rule for minimum error, a new method is proposed for predicting the structural class of a protein according to its amino acid composition. The detailed theoretical analysis indicates that if the four protein folding classes are governed by the normal distributions, the present method will yield the optimum predictive result in a statistical sense. A non-redundant data set of 1,189 protein domains is used to evaluate the performance of the new method. Our results demonstrate that 60% correctness is the upper limit for a 4-type class prediction from amino acid composition alone for an unknown query protein. The apparent relatively high accuracy level (more than 90%) attained in the previous studies was due to the preselection of test sets, which may not be adequately representative of all unrelated proteins.

10.
Review: protein secondary structure prediction continues to rise   Total citations: 15 (self-citations: 0, citations by others: 15)
Methods predicting protein secondary structure improved substantially in the 1990s through the use of evolutionary information taken from the divergence of proteins in the same structural family. Recently, the evolutionary information resulting from improved searches and larger databases has again boosted prediction accuracy by more than four percentage points to its current height of around 76% of all residues predicted correctly in one of the three states, helix, strand, and other. The past year also brought successful new concepts to the field. These new methods may be particularly interesting in light of the improvements achieved through simple combining of existing methods. Divergent evolutionary profiles contain enough information not only to substantially improve prediction accuracy, but also to correctly predict long stretches of identical residues observed in alternative secondary structure states depending on nonlocal conditions. An example is a method automatically identifying structural switches and thus finding a remarkable connection between predicted secondary structure and aspects of function. Secondary structure predictions are increasingly becoming the workhorse for numerous methods aimed at predicting protein structure and function. Is the recent increase in accuracy significant enough to make predictions even more useful? Because the recent improvement yields a better prediction of segments, and in particular of beta strands, I believe the answer is affirmative. What is the limit of prediction accuracy? We shall see.

11.
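Dynamic remote sensing estimation of grassland forage yield in Xilingol League, Inner Mongolia   Total citations: 6 (self-citations: 0, citations by others: 6)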
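Monitoring grassland forage yield provides a key indicator for studying the spatial dynamics of grassland resources and an important basis for their rational use and for monitoring the forage-livestock balance. Based on survey data from 371 sample plots and MODIS-NDVI remote sensing data for 2005-2009, we built models relating the forage yield of ground quadrats to the remote sensing data and simulated the spatiotemporal distribution of forage yield in the grasslands of Xilingol League, Inner Mongolia. The results show that: (1) all of the model equations showed good correlation, with the power function performing best; validation against reserved quadrat data gave a model accuracy of 78%, so the power function model is feasible for remote sensing estimation; (2) the 5-year mean forage yield of the Xilingol grasslands was 34.55 million tonnes, equivalent to 11.12 million tonnes of hay, with a mean yield per unit area of 567.23 kg/hm², and forage yield was high in the east and low in the west; (3) during 2005-2009, forage yield fluctuated markedly, with hay ranging from 8 to 14 million tonnes and a coefficient of variation of 20.42%; (4) forage yield and its interannual variation differed considerably among grassland types: desert-type grasslands had low yield and large interannual variation, whereas meadow-type grasslands had high yield and relatively small interannual variation. Spatiotemporal variation in forage yield was also closely related to major climatic factors such as precipitation and air temperature, and was especially strongly affected by the spatiotemporal variation of precipitation. The results can serve as a reference for the protection and rational use of grassland resources in China.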
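
A power-function model of forage yield against NDVI, the form this abstract reports as the best fit, can be fitted as in the sketch below. It assumes the form yield = a · NDVI^b estimated by ordinary least squares on log-transformed values; the abstract does not say how the model was fitted, and the function name and data points are hypothetical.

```python
import math


def fit_power_model(ndvi, forage_yield):
    """Fit forage_yield = a * ndvi**b by least squares in log-log space."""
    lx = [math.log(x) for x in ndvi]
    ly = [math.log(y) for y in forage_yield]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    # Slope of the log-log regression line is the exponent b.
    b = sum((x - mx) * (y - my) for x, y in zip(lx, ly)) / sum(
        (x - mx) ** 2 for x in lx
    )
    a = math.exp(my - b * mx)  # intercept back-transformed to the scale a
    return a, b


# Hypothetical quadrat data generated from yield = 2000 * NDVI**1.5.
ndvi = [0.2, 0.3, 0.5, 0.7]
forage_yield = [2000 * x ** 1.5 for x in ndvi]
a, b = fit_power_model(ndvi, forage_yield)
print(round(a, 3), round(b, 3))
```

Validation in the spirit of the abstract would hold back some quadrats, predict their yield with the fitted a and b, and compare against the measured values.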

12.
Kaur H  Raghava GP 《Proteins》2004,55(1):83-90
In this paper a systematic attempt has been made to develop a better method for predicting alpha-turns in proteins. Most of the commonly used approaches in the field of protein structure prediction have been tried in this study, including the statistical approach "Sequence Coupled Model" and the machine learning approaches (i) artificial neural network (ANN), (ii) Weka (Waikato Environment for Knowledge Analysis) classifiers and (iii) Parallel Exemplar Based Learning (PEBLS). We have also used multiple sequence alignment obtained from PSIBLAST and secondary structure information predicted by PSIPRED. The training and testing of all methods has been performed on a data set of 193 non-homologous protein X-ray structures using five-fold cross-validation. It has been observed that ANN with multiple sequence alignment and predicted secondary structure information outperforms other methods. Based on our observations we have developed an ANN-based method for predicting alpha-turns in proteins. The main components of the method are two feed-forward back-propagation networks with a single hidden layer. The first sequence-structure network is trained with the multiple sequence alignment in the form of PSI-BLAST-generated position specific scoring matrices. The initial predictions obtained from the first network and PSIPRED predicted secondary structure are used as input to the second structure-structure network to refine the predictions obtained from the first net. The final network yields an overall prediction accuracy of 78.0% and MCC of 0.16. A web server AlphaPred (http://www.imtech.res.in/raghava/alphapred/) has been developed based on this approach.
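
The abstract reports both accuracy and the Matthews correlation coefficient (MCC). The sketch below computes the two from a binary confusion matrix using their standard definitions, to show why a fairly high accuracy can sit alongside a low MCC when the positive class is rare. The counts are hypothetical and are not taken from the study.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of residues predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 if any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for a rare positive class (50 turns in 1000 residues).
tp, tn, fp, fn = 10, 770, 180, 40
print(accuracy(tp, tn, fp, fn), mcc(tp, tn, fp, fn))
```

With these counts the accuracy is 0.78 while the MCC stays close to zero, because most correct calls come from the abundant negative class.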

13.
Halperin I  Wolfson H  Nussinov R 《Proteins》2006,63(4):832-845
Correlated mutations have been repeatedly exploited for intramolecular contact map prediction. Over the last decade these efforts yielded several methods for measuring correlated mutations. Nevertheless, the application of correlated mutations for the prediction of intermolecular interactions has not yet been explored. This gap is due to several obstacles, such as 3D complexes availability, paralog discrimination, and the availability of sequence pairs that are required for inter- but not intramolecular analyses. Here we selected for analysis fusion protein families that bypass some of these obstacles. We find that several correlated mutation measurements yield reasonable accuracy for intramolecular contact map prediction on the fusion dataset. However, the accuracy level drops sharply in intermolecular contacts prediction. This drop in accuracy does not always occur. In the Cohesin-Dockerin family, reasonable accuracy is achieved in the prediction of both intra- and intermolecular contacts. The Cohesin-Dockerin family is well suited for correlated mutation analysis. Because, however, this family constitutes a special case (it has radical mutations, has domain repeats, within each species each Dockerin domain interacts with each Cohesin domain, see below), the successful prediction in this family does not point to a general potential in using correlated mutations for predicting intermolecular contacts. Overall, the results of our study indicate that current methodologies of correlated mutations analysis are not suitable for large-scale intermolecular contact prediction, and thus cannot assist in docking. With current measurements, sequence availability, sequence annotations, and underdeveloped sequence pairing methods, correlated mutations can yield reasonable accuracy only for a handful of families.

14.
MOTIVATION: Phylogenetic profiling methods can achieve good accuracy in predicting protein-protein interactions, especially in prokaryotes. Recent studies have shown that the choice of reference taxa (RT) is critical for accurate prediction, but with more than 2500 fully sequenced taxa publicly available, identifying the most-informative RT is becoming increasingly difficult. Previous studies on the selection of RT have provided guidelines for manual taxon selection, and for eliminating closely related taxa. However, no general strategy for automatic selection of RT is currently available. RESULTS: We present three novel methods for automating the selection of RT, using machine learning based on known protein-protein interaction networks. One of these methods in particular, Tree-Based Search, yields greatly improved prediction accuracies. We further show that different methods for constituting phylogenetic profiles often require very different RT sets to support high prediction accuracy.
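
Why the choice of reference taxa matters can be seen in a minimal sketch of profile comparison. It assumes binary presence/absence profiles and Jaccard similarity as the score; the abstract names neither, and it does not describe Tree-Based Search in enough detail to reproduce, so none of this is the authors' method. The function name and profiles are hypothetical.

```python
def profile_similarity(profile_a, profile_b, reference_taxa):
    """Jaccard similarity of two presence/absence profiles over chosen taxa."""
    both = sum(1 for t in reference_taxa if profile_a[t] and profile_b[t])
    either = sum(1 for t in reference_taxa if profile_a[t] or profile_b[t])
    return both / either if either else 0.0


# Hypothetical profiles: 1 means a homolog is present in that taxon.
protein_a = [1, 1, 0, 1, 0, 1]
protein_b = [1, 1, 0, 0, 0, 1]

all_taxa = range(6)
chosen_taxa = [0, 1, 5]  # a hypothetical selected subset of reference taxa
print(profile_similarity(protein_a, protein_b, all_taxa))
print(profile_similarity(protein_a, protein_b, chosen_taxa))
```

The same pair of proteins scores differently under the two taxon sets, which is the sensitivity that automated selection of reference taxa is meant to exploit.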

15.
We conducted a simulation study to compare two methods that have been recently used in clinical literature for the dynamic prediction of time to pregnancy. The first is landmarking, a semi-parametric method where predictions are updated as time progresses using the patient subset still at risk at that time point. The second is the beta-geometric model that updates predictions over time from a parametric model estimated on all data and is specific to applications with a discrete time to event outcome. The beta-geometric model introduces unobserved heterogeneity by modelling the chance of an event per discrete time unit according to a beta distribution. Due to selection of patients with lower chances as time progresses, the predicted probability of an event decreases over time. Both methods were recently used to develop models predicting the chance to conceive naturally. The advantages, disadvantages and accuracy of these two methods are unknown. We simulated time-to-pregnancy data according to different scenarios. We then compared the two methods by the following out-of-sample metrics: bias and root mean squared error in the average prediction, root mean squared error in individual predictions, Brier score and c statistic. We consider different scenarios including data-generating mechanisms for which the models are misspecified. We applied the two methods on a clinical dataset comprising 4999 couples. Finally, we discuss the pros and cons of the two methods based on our results and present recommendations for use of either of the methods in different settings and (effective) sample sizes.
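
The declining predicted probability described for the beta-geometric model follows from a short calculation. If the per-cycle chance p follows a Beta(a, b) distribution, then after t − 1 unsuccessful cycles the distribution of p among couples still trying is Beta(a, b + t − 1), so the chance of an event in cycle t is a / (a + b + t − 1). The sketch below uses this standard result without covariates; the parameter values and function names are hypothetical.

```python
def cycle_probability(a, b, t):
    """Chance of an event in cycle t given no event in cycles 1..t-1."""
    return a / (a + b + t - 1)


def probability_within(a, b, failed_cycles, horizon):
    """Chance of an event within the next `horizon` cycles."""
    no_event = 1.0
    for j in range(horizon):
        no_event *= 1 - cycle_probability(a, b, failed_cycles + 1 + j)
    return 1 - no_event


a, b = 1.0, 1.0  # hypothetical beta parameters
for t in (1, 2, 6, 12):
    print(t, cycle_probability(a, b, t))
```

Updating the prediction for a couple is then just a matter of increasing the number of failed cycles, which is what makes the model dynamic.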

16.
Li X  Pan XM 《Proteins》2001,42(1):1-5
A novel method was developed for predicting the solvent accessibility. Based on single sequence data, this method achieved 71.5% accuracy with a correlation coefficient of 0.42 in a database of 704 proteins with threshold of 20% for a two-state-defining solvent accessibility. Prediction in a data subset of 341 monomeric proteins achieved 72.7% accuracy with a correlation coefficient of 0.43. On average, prediction over short chains gives better results than that over long chains. With a solvent accessibility threshold of 20%, prediction over 236 monomeric proteins with chain length < 300 amino acid residues achieved 75.3% accuracy with a correlation coefficient of 0.44 by jackknife analysis, which is higher than that obtained by previous methods using multiple sequence alignments.

17.
Using evolutionary information contained in multiple sequence alignments as input to neural networks, secondary structure can be predicted at significantly increased accuracy. Here, we extend our previous three-level system of neural networks by using additional input information derived from multiple alignments. Using a position-specific conservation weight as part of the input increases performance. Using the number of insertions and deletions reduces the tendency for overprediction and increases overall accuracy. Addition of the global amino acid content yields a further improvement, mainly in predicting structural class. The final network system has a sustained overall accuracy of 71.6% in a multiple cross-validation test on 126 unique protein chains. A test on a new set of 124 recently solved protein structures that have no significant sequence similarity to the learning set confirms the high level of accuracy. The average cross-validated accuracy for all 250 sequence-unique chains is above 72%. Using various data sets, the method is compared to alternative prediction methods, some of which also use multiple alignments: the performance advantage of the network system is at least 6 percentage points in three-state accuracy. In addition, the network estimates secondary structure content from multiple sequence alignments about as well as circular dichroism spectroscopy on a single protein and classifies 75% of the 250 proteins correctly into one of four protein structural classes. Of particular practical importance is the definition of a position-specific reliability index. For 40% of all residues the method has a sustained three-state accuracy of 88%, as high as the overall average for homology modelling. A further strength of the method is greatly increased accuracy in predicting the placement of secondary structure segments. © 1994 Wiley-Liss, Inc.

18.

Background

Ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation. Their secondary structures are crucial for the RNA functionality, and the prediction of the secondary structures is widely studied. Our previous research shows that cutting long sequences into shorter chunks, predicting secondary structures of the chunks independently using thermodynamic methods, and reconstructing the entire secondary structure from the predicted chunk structures can yield better accuracy than predicting the secondary structure using the RNA sequence as a whole. The chunking, prediction, and reconstruction processes can use different methods and parameters, some of which produce more accurate predictions than others. In this paper, we study the prediction accuracy and efficiency of three different chunking methods using seven popular secondary structure prediction programs that apply to two datasets of RNA with known secondary structures, which include both pseudoknotted and non-pseudoknotted sequences, as well as a family of viral genome RNAs whose structures have not been predicted before. Our modularized MapReduce framework based on Hadoop allows us to study the problem in a parallel and robust environment.

Results

On average, the maximum accuracy retention values are larger than one for our chunking methods and the seven prediction programs over 50 non-pseudoknotted sequences, meaning that the secondary structure predicted using chunking is more similar to the real structure than the secondary structure predicted by using the whole sequence. We observe similar results for the 23 pseudoknotted sequences, except for the NUPACK program using the centered chunking method. The performance analysis for 14 long RNA sequences from the Nodaviridae virus family outlines how the coarse-grained mapping of chunking and predictions in the MapReduce framework exhibits shorter turnaround times for short RNA sequences. However, as the lengths of the RNA sequences increase, the fine-grained mapping can surpass the coarse-grained mapping in performance.

Conclusions

By using our MapReduce framework together with statistical analysis on the accuracy retention results, we observe how the inversion-based chunking methods can outperform predictions using the whole sequence. Our chunk-based approach also enables us to predict secondary structures for very long RNA sequences, which is not feasible with traditional methods alone.
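
The chunk, predict and reconstruct pipeline described in this entry can be outlined in a minimal sketch. It uses plain fixed-length chunking, not the centered or inversion-based methods the paper studies, and a toy stand-in for a real structure predictor; treating accuracy retention as the ratio of chunk-based to whole-sequence accuracy is also an assumption drawn from the Results wording. All names are hypothetical.

```python
def chunk_sequence(seq, max_len):
    """Cut a sequence into consecutive pieces, keeping each start offset."""
    return [(start, seq[start:start + max_len])
            for start in range(0, len(seq), max_len)]


def predict_by_chunks(seq, max_len, predict):
    """Predict base pairs per chunk and shift them back to global indices."""
    pairs = []
    for offset, piece in chunk_sequence(seq, max_len):
        pairs.extend((i + offset, j + offset) for i, j in predict(piece))
    return pairs


def accuracy_retention(acc_chunked, acc_whole):
    """Ratio above 1 means chunking matched the known structure better."""
    return acc_chunked / acc_whole


def toy_predict(piece):
    """Placeholder predictor: pairs the first and last base of a piece."""
    return [(0, len(piece) - 1)] if len(piece) >= 2 else []


print(predict_by_chunks("GGGAAACCCU", 4, toy_predict))
```

Because each chunk is predicted independently, the per-chunk calls are the natural unit to distribute across workers, which is the role the entry assigns to its MapReduce framework.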

19.
Selection of recombinant inbred lines (RILs) from elite hybrids is a key method in maize breeding especially in developing countries. The RILs are normally derived by repeated self-pollination and selection. In this study, we first investigated the accuracy of different models in predicting the performance of F1 hybrids between RILs derived from two elite maize inbred lines Zong3 and 87-1, and then compared these models through simulation using a wider range of genetic models. Results indicated that appropriate prediction models depended on genetic architecture, e.g., combined model using breeding value and genome-wide prediction (BV+GWP) has the highest prediction accuracy for high V_D/V_A ratio (>0.5) traits. Theoretical studies demonstrated that different components of genetic variance were captured by different prediction models, which in turn explained the accuracy of these models in predicting the F1 hybrid performance. Based on genome-wide prediction model (GWP), 114 untested F1 hybrids possibly having higher grain yield than the original F1 hybrid Yuyu22 (the single cross between Zong3 and 87-1) have been identified and recommended for further field test.

20.
Chen YC  Wu CY  Lim C 《Proteins》2007,67(3):671-680
Binding of polyanionic DNA depends on the cluster of electropositive atoms in the binding site of a DNA-binding protein. Such a cluster of electropositive protein atoms would be electrostatically unfavorable without stabilizing interactions from the respective electronegative DNA atoms and would likely be evolutionarily conserved due to its critical biological role. Consequently, our strategy for predicting DNA-binding residues is based on detecting a cluster of evolutionarily conserved surface residues that are electrostatically stabilized upon mutation to negatively charged Asp/Glu residues. The method requires as input the protein structure and sufficient sequence homologs to define each residue's relative conservation, and it yields as output experimentally testable residues that are predicted to bind DNA. By incorporating characteristic DNA-binding site features (i.e., electrostatic strain and amino acid conservation), the new method yields a prediction accuracy of 83%, which is much higher than methods based on only electrostatic strain (57%) or conservation alone (50%). It is also less sensitive to protein conformational changes upon DNA binding than methods that mainly depend on the 3D protein structure.
