Similar literature
1.
The availability of user‐friendly software to annotate biological datasets and experimental details is becoming essential in data management practices, both in local storage systems and in public databases. The Ontology Lookup Service (OLS, http://www.ebi.ac.uk/ols) is a popular centralized service to query, browse and navigate biomedical ontologies and controlled vocabularies. Recently, the OLS framework has been completely redeveloped (version 3.0), including enhancements in the data model, such as added support for Web Ontology Language based ontologies, among many other improvements. However, the new OLS is not backwards compatible, and new software tools are needed to enable access to this widely used framework now that the previous version is no longer available. Here we present the OLS Client, a free, open‐source Java library to retrieve information from the new version of the OLS. It enables rapid tool creation by providing a robust, pluggable programming interface and a common data model to programmatically access the OLS. The library has already been integrated into, and is routinely used by, several bioinformatics resources and related data annotation tools. We also introduce an updated version of the OLS Dialog (version 2.0), a Java graphical user interface that can be easily plugged into Java desktop applications to access the OLS. The software and related documentation are freely available at https://github.com/PRIDE-Utilities/ols-client and https://github.com/PRIDE-Toolsuite/ols-dialog .

2.
There have been numerous claims in the ecological literature that spatial autocorrelation in the residuals of ordinary least squares (OLS) regression models results in shifts in the partial coefficients, which bias the interpretation of factors influencing geographical patterns. We evaluate the validity of these claims using gridded species richness data for the birds of North America, South America, Europe, Africa, the ex‐USSR, and Australia. We used richness in 110×110 km cells and environmental predictor variables to generate OLS and simultaneous autoregressive (SAR) multiple regression models for each region. Spatial correlograms of the residuals from each OLS model were then used to identify the minimum distance between cells necessary to avoid short‐distance residual spatial autocorrelation in each data set. This distance was used to subsample cells to generate spatially independent data. The partial OLS coefficients estimated with the full dataset were then compared to the distributions of coefficients created with the subsamples. For all 22 coefficients tested, OLS coefficients generated from data containing residual spatial autocorrelation were statistically indistinguishable from coefficients generated from the same data sets after short‐distance spatial autocorrelation had been removed. Consistent with the statistical literature on this subject, we conclude that coefficients estimated from OLS regression are not seriously affected by the presence of spatial autocorrelation in gridded geographical data. Further, shifts in coefficients that occurred when using SAR tended to be correlated with levels of uncertainty in the OLS coefficients. Thus, shifts in the relative importance of the predictors between OLS and SAR models are expected when small‐scale patterns for these predictors create weaker and more unstable broad‐scale coefficients. Our results indicate both that OLS regression is unbiased and that differences between spatial and nonspatial regression models should be interpreted with an explicit awareness of spatial scale.

3.
This first article of a two‐article series describes a framework and life cycle–based model for typical almond orchard production systems for California, where more than 80% of commercial almonds on the world market are produced. The comprehensive, multiyear, life cycle–based model includes orchard establishment and removal; field operations and inputs; emissions from orchard soils; and transport and utilization of co‐products. These processes are analyzed to yield a life cycle inventory of energy use, greenhouse gas (GHG) emissions, criteria air pollutants, and direct water use from field to factory gate. Results show that 1 kilogram (kg) of raw almonds and associated co‐products of hulls, shells, and woody biomass require 35 megajoules (MJ) of energy and result in 1.6 kg carbon dioxide equivalent (CO2‐eq) of GHG emissions. Nitrogen fertilizer and irrigation water are the dominant causes of both energy use and GHG emissions. Co‐product credits play an important role in estimating the life cycle environmental impacts attributable to almonds alone; using displacement methods results in net energy and emissions of 29 MJ and 0.9 kg CO2‐eq/kg. The largest sources of credits are from orchard biomass and shells used in electricity generation, which are modeled as displacing average California electricity. Using economic allocation methods produces significantly different results; 1 kg of almonds is responsible for 33 MJ of energy and 1.5 kg CO2‐eq emissions. Uncertainty analysis of important parameters and assumptions, as well as temporary carbon storage in orchard trees and soils, are explored in the second article of this two‐part article series.

4.
A series of experiments was conducted using small wind tunnels to assess the influence of a range of environmental, manure and management variables on ammonia emissions following application of different manure types to grassland and arable land. Wind speed and dry matter content (for cattle slurry in particular) were identified as the parameters with the greatest influence on ammonia emissions from slurries. For solid manures, rainfall was identified as the parameter with the most influence on ammonia emissions. A Michaelis-Menten function was used to describe emission rates following manure application. Linear regression was then used to develop statistical models relating the Michaelis-Menten function parameters to the experimental variables for each manure type/land use combination. The fitted models accounted for between 62% and 94% of the variation in the data. Validation of the models for cattle slurry to grassland and pig slurry to arable land against independent data sets obtained from experiments using the micrometeorological mass balance measurement technique showed that the models overestimated losses, most probably because of inherent differences between the wind tunnel and the micrometeorological mass balance measurement techniques.
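The abstract above does not give the exact form of its Michaelis-Menten function. As a minimal sketch, assuming the usual saturating form N(t) = N_max · t / (t + K_m) for cumulative loss, the curve and its rate could be written as follows; the function names and the parameter values used in any call are hypothetical.

```python
def cumulative_emission(t_hours, n_max, k_m):
    """Cumulative loss N(t) = n_max * t / (t + k_m).

    Assumed form: n_max is the asymptotic total loss and k_m is the
    time at which half of n_max has been emitted.
    """
    if t_hours < 0:
        raise ValueError("time must be non-negative")
    return n_max * t_hours / (t_hours + k_m)


def emission_rate(t_hours, n_max, k_m):
    """Instantaneous rate dN/dt = n_max * k_m / (t + k_m)**2."""
    if t_hours < 0:
        raise ValueError("time must be non-negative")
    return n_max * k_m / (t_hours + k_m) ** 2
```

Under this assumed form, the regression step described in the abstract would relate n_max and k_m to variables such as wind speed or slurry dry matter content.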

5.
Recently, the stable light products and radiance calibrated products from the Defense Meteorological Satellite Program's (DMSP) Operational Linescan System (OLS) have been useful for mapping global fossil fuel carbon dioxide (CO2) emissions at fine spatial resolution. However, few studies on this subject have used the new-generation nighttime light data from the Visible Infrared Imaging Radiometer Suite (VIIRS) sensor on the Suomi National Polar-orbiting Partnership (NPP) satellite, which has a higher spatial resolution and a wider radiometric detection range than the traditional DMSP-OLS nighttime light data. This study therefore performed the first evaluation of the potential of NPP-VIIRS data for estimating the spatial distribution of global CO2 emissions (excluding power plant emissions). Using a disaggregation model, three global emission maps were derived from population counts and three different types of nighttime light data (NPP-VIIRS, and the stable light data and radiance calibrated data of DMSP-OLS) for comparative analysis. Comparison with reference land cover data for Beijing, Shanghai and Guangzhou shows that the emission areas in the NPP-VIIRS map are more spatially consistent with artificial surfaces and exhibit a more reasonable distribution of CO2 emissions than those in the two DMSP-OLS maps. In addition, compared with the two DMSP-OLS maps, the NPP-VIIRS emission map is closer to the Vulcan inventory and agrees better with the actual statistical data on CO2 emissions at the level of sub-administrative units of the United States. This study demonstrates that NPP-VIIRS data can be a powerful tool for studying the spatial distribution of CO2 emissions, as well as socioeconomic indicators, at multiple scales.

6.
Cyanobacteria possess the unique capacity to naturally produce hydrocarbons from fatty acids. Hydrocarbon compositions of thirty-two strains of cyanobacteria were characterized to reveal novel structural features and insights into hydrocarbon biosynthesis in cyanobacteria. This investigation revealed new double bond (2- and 3-heptadecene) and methyl group positions (3-, 4- and 5-methylheptadecane) for a variety of strains. Additionally, results from this study and literature reports indicate that hydrocarbon production is a universal phenomenon in cyanobacteria. All cyanobacteria possess the capacity to produce hydrocarbons from fatty acids, yet not all accomplish this through the same metabolic pathway. One pathway comprises a two-step conversion of fatty acids first to fatty aldehydes and then to alkanes, involving a fatty acyl ACP reductase (FAAR) and an aldehyde deformylating oxygenase (ADO). The second involves a polyketide synthase (PKS) pathway that first elongates the acyl chain and then decarboxylates it to produce a terminal alkene (olefin synthase, OLS). Sixty-one strains possessing the FAAR/ADO pathway and twelve strains possessing the OLS pathway were newly identified through bioinformatic analyses. Strains possessing the OLS pathway formed a cohesive phylogenetic clade, with the exception of three Moorea strains and Leptolyngbya sp. PCC 6406, which may have acquired the OLS pathway via horizontal gene transfer. Hydrocarbon pathways were identified in 142 strains of cyanobacteria over a broad phylogenetic range, and there were no instances where both the FAAR/ADO and the OLS pathways were found together in the same genome, suggesting that an unknown selective pressure maintains one or the other pathway, but not both.

7.
Many investigators use the reduced major axis (RMA) instead of ordinary least squares (OLS) to define a line of best fit for a bivariate relationship when the variable represented on the X‐axis is measured with error. OLS frequently is described as requiring the assumption that X is measured without error while RMA incorporates an assumption that there is error in X. Although an RMA fit actually involves a very specific pattern of error variance, investigators have prioritized the presence versus the absence of error rather than the pattern of error in selecting between the two methods. Another difference between RMA and OLS is that RMA is symmetric, meaning that a single line defines the bivariate relationship, regardless of which variable is X and which is Y, while OLS is asymmetric, so that the slope and resulting interpretation of the data are changed when the variables assigned to X and Y are reversed. The concept of error is reviewed and expanded from previous discussions, and it is argued that the symmetry‐asymmetry issue should be the criterion by which investigators choose between RMA and OLS. This is a biological question about the relationship between variables. It is determined by the investigator, not dictated by the pattern of error in the data. If X is measured with error but OLS should be used because the biological question is asymmetric, there are several methods available for adjusting the OLS slope to reflect the bias due to error. RMA is being used in many analyses for which OLS would be more appropriate. Am J Phys Anthropol, 2009. © 2009 Wiley‐Liss, Inc.
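The symmetry argument in the abstract above can be checked numerically. The sketch below uses the standard textbook definitions (OLS slope = cov(x, y) / var(x); RMA slope = sign(r) · sd(y) / sd(x)); the function names are illustrative and not taken from the paper.

```python
import numpy as np


def ols_slope(x, y):
    """OLS slope of y on x: cov(x, y) / var(x)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)


def rma_slope(x, y):
    """Reduced major axis slope: sign(r) * sd(y) / sd(x)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    return np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)
```

Swapping the variables turns the RMA slope into its exact reciprocal, so both orderings describe the same line. The two OLS slopes multiply to r² instead, so they describe different lines unless the correlation is perfect.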

8.
When it comes to fitting simple allometric slopes through measurement data, evolutionary biologists have been torn between regression methods. On the one hand, there is the ordinary least squares (OLS) regression, which is commonly used across many disciplines of biology to fit lines through data, but which has a reputation for underestimating slopes when measurement error is present. On the other hand, there is the reduced major axis (RMA) regression, which is often recommended as a substitute for OLS regression in studies of allometry, but which has several weaknesses of its own. Here, we review statistical theory as it applies to evolutionary biology and studies of allometry. We point out that the concerns that arise from measurement error for OLS regression are small and straightforward to deal with, whereas RMA has several key properties that make it unfit for use in the field of allometry. The recommended approach for researchers interested in allometry is to use OLS regression on measurements taken with low (but realistically achievable) measurement error. If measurement error is unavoidable and relatively large, it is preferable to correct for slope attenuation rather than to turn to RMA regression, or to take the expected amount of attenuation into account when interpreting the data.
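The abstract above does not say which attenuation correction it has in mind. One common choice, sketched here as an assumption, is the classical errors-in-variables correction: if X carries additive error that is independent of the true values and has known variance (for example, estimated from repeated measurements), the OLS slope is divided by the reliability ratio. All names below are hypothetical.

```python
def reliability_ratio(var_x_observed, var_x_error):
    """Share of the observed variance in X that is not measurement error."""
    if not 0 <= var_x_error < var_x_observed:
        raise ValueError("error variance must be in [0, observed variance)")
    return (var_x_observed - var_x_error) / var_x_observed


def corrected_slope(ols_slope, var_x_observed, var_x_error):
    """Undo attenuation under the classical additive-error assumption."""
    return ols_slope / reliability_ratio(var_x_observed, var_x_error)
```

For instance, an OLS slope of 0.8 with an observed variance of 5 and an error variance of 1 has a reliability ratio of 0.8, so the corrected slope is 1.0.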

9.
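Based on Landsat TM land-cover classification data and MODIS land surface temperature (LST) data, this study examined the LST (7-day) of different land-cover types in the Beijing-Tianjin-Tangshan urban agglomeration, and fitted the relationship between land-cover proportion and LST using two common methods, ordinary linear regression (OLS) and geographically weighted regression (GWR). The results showed that LST differed markedly among land-cover types: artificial surfaces (40.92±3.49 ℃) and cropland (39.74±3.74 ℃) had higher mean temperatures, whereas forest (34.43±4.16 ℃) and wetland (35.42±4.33 ℃) had lower mean temperatures. Land-cover proportion was significantly correlated with LST, and the quantitative relationship between the two was spatially non-stationary; differences in geographical location and in the influence of the surrounding environment were the main causes of this non-stationarity. The GWR model fitted the data better than the OLS model (R² of GWR > R² of OLS), and GWR could quantify the spatially non-stationary character of the relationship between land-cover proportion and LST.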

10.
Ordinary least squares (OLS) regression has been widely used to analyze patient-level data in cost-effectiveness analysis (CEA). However, the estimates, inference and decision making in an economic evaluation based on OLS estimation may be biased by the presence of outliers. Robust estimation, by contrast, is resistant to outliers. The objective of this study is to explore the impact of outliers on net-benefit regression (NBR) in CEA using OLS and to propose a potential solution using robust estimation, i.e. Huber M-estimation, Hampel M-estimation, Tukey's bisquare M-estimation, MM-estimation and least trimmed squares estimation. Simulations under different outlier-generating scenarios and an empirical example were used to obtain the regression estimates of NBR by OLS and the five robust estimators. The empirical size and empirical power of OLS and the robust estimators were then compared in the context of hypothesis testing. Simulations showed that, compared with OLS estimation, the five robust approaches led to lower empirical sizes and achieved higher empirical powers in testing cost-effectiveness. In a real example of antiplatelet therapy, the incremental net-benefit estimated by OLS was lower than that estimated by the robust approaches because of outliers in the cost data. Robust estimation gave a higher probability of cost-effectiveness than OLS estimation. The presence of outliers can bias the results of NBR and their interpretation. Robust estimation in NBR is recommended as an appropriate way to avoid such biased decision making.
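To illustrate how one of the robust estimators named above limits the pull of outliers, here is a minimal sketch of Huber M-estimation fitted by iteratively reweighted least squares (IRLS). It is not the study's implementation: the tuning constant 1.345, the MAD-based scale and the stopping rule are conventional choices assumed here, and `huber_irls` is a hypothetical name.

```python
import numpy as np


def huber_irls(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Huber M-estimate of regression coefficients via IRLS.

    X must already contain an intercept column if one is wanted.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from the OLS fit
    for _ in range(n_iter):
        resid = y - X @ beta
        # Robust residual scale: median absolute deviation / 0.6745.
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        if scale == 0:
            break
        u = np.abs(resid) / scale
        # Huber weights: 1 inside the threshold, c / u beyond it.
        w = np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta
```

Observations with large standardized residuals receive weights below 1, so a single extreme cost value shifts the fitted coefficients less than it would under OLS.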

11.
Testimation is considered in the problem of estimating regression parameters. The first-stage sample is used to test a (null) hypothesis that specifies initial (preassumed) values for some of the regression parameters. If the data agree with the hypothesis, a linear combination of the preassumed values and the ordinary least squares (OLS) estimates is taken as the estimate. Otherwise, a second sample is taken and the parameters are estimated by OLS alone, based on the combined sample. The procedure protects against type II error and against taking larger samples when inference can be made from a smaller sample.

12.
A genetic model was proposed to simultaneously investigate genetic effects of both polygenes and several single genes for quantitative traits of diploid plants and animals. Mixed linear model approaches were employed for statistical analysis. Based on two mating designs, a full diallel cross and a modified diallel cross including F2, Monte Carlo simulations were conducted to evaluate the unbiasedness and efficiency of the estimation of generalized least squares (GLS) and ordinary least squares (OLS) for fixed effects and of minimum norm quadratic unbiased estimation (MINQUE) and Henderson III for variance components. Estimates of MINQUE (1) were unbiased and efficient in both reduced and full genetic models. Henderson III could have a large bias when used to analyze the full genetic model. Simulation results also showed that GLS and OLS were good methods to estimate fixed effects in the genetic models. Data on Drosophila melanogaster from Gilbert were used as a worked example to demonstrate the parameter estimation. Received: 11 November 2000 / Accepted: 2 May 2001

13.
Summary. Statistical properties of the ordinary least-squares (OLS), generalized least-squares (GLS), and minimum-evolution (ME) methods of phylogenetic inference were studied by considering the case of four DNA sequences. Analytical study has shown that all three methods are statistically consistent in the sense that as the number of nucleotides examined (m) increases they tend to choose the true tree as long as the evolutionary distances used are unbiased. When evolutionary distances (d_ij's) are large and sequences under study are not very long, however, the OLS criterion is often biased and may choose an incorrect tree more often than expected under random choice. It is also shown that the variance-covariance matrix of d_ij's becomes singular as d_ij's approach zero and thus the GLS may not be applicable when d_ij's are small. The ME method suffers from neither of these problems, and the ME criterion is statistically unbiased. Computer simulation has shown that the ME method is more efficient in obtaining the true tree than the OLS and GLS methods and that the OLS is more efficient than the GLS when d_ij's are small, but otherwise the GLS is more efficient. Offprint requests to: M. Nei

14.
Classically, hypotheses concerning the distribution of species have been explored by evaluating the relationship between species richness and environmental variables using ordinary least squares (OLS) regression. However, environmental and ecological data generally show spatial autocorrelation, thus violating the assumption of independently distributed errors. When spatial autocorrelation exists, an alternative is to use autoregressive models that assume spatially autocorrelated errors. We examined the relationship between mammalian species richness in South America and environmental variables, thereby evaluating the relative importance of four competing hypotheses to explain mammalian species richness. Additionally, we compared the results of OLS regression with those of spatial autoregressive models, namely Conditional and Simultaneous Autoregressive (CAR and SAR, respectively) models. Variables associated with productivity were the most important in determining mammalian species richness at the scale analyzed. Whereas the residuals of the OLS regression of species richness on environmental variables were strongly autocorrelated, those from autoregressive models showed less spatial autocorrelation, particularly the SAR model, indicating its suitability for these data. Autoregressive models also fit the data better than the OLS model (increasing R² by 5–14%), and the relative importance of the explanatory variables shifted under CAR and SAR models. These analyses underscore the importance of controlling for spatial autocorrelation in biogeographical studies.

15.
This study examines the impacts of income, energy consumption and population growth on CO2 emissions using annual time series data for the period 1970–2012 for India, Indonesia, China, and Brazil. The study used the Autoregressive Distributed Lag (ARDL) bounds test approach, under both linear and non-linear assumptions, for the time series of these top CO2-emitting emerging countries in both the short run and the long run. The results show that CO2 emissions increased statistically significantly with increases in income and energy consumption in all four countries. The relationship between CO2 emissions and population growth was statistically significant for India and Brazil, but statistically insignificant for China and Indonesia in both the short run and the long run. Empirical observations from testing the environmental Kuznets curve (EKC) hypothesis also imply that in Brazil, China and Indonesia, CO2 emissions will decrease over time as income increases. Based on the EKC findings, it can therefore be argued that these three countries should not adopt actions or policies to reduce their CO2 emissions that might restrain income. In India, however, where CO2 emissions and income were found to have a positive relationship, an increase in income over time will not reduce CO2 emissions.

16.
Acoustic call sequences are important components of vocal repertoires for many animal species. Bottlenose dolphins (Tursiops truncatus) produce a wide variety of vocalizations, in different behavioural contexts, including some conspicuous vocal sequences – the ‘bray series’. The occurrence of brays is still insufficiently documented, contextually and geographically, and the specific functions of these multi-unit emissions are yet to be understood. Here, acoustic emissions produced by bottlenose dolphins in the Sado estuary, Portugal, were used to provide a structural characterization of the discrete elements that compose the bray series. Information theory techniques were applied to analyse bray sequences and explore the complexity of these calls. Log-frequency analysis, based on bout criterion interval, confirmed the bout structure of the bray series. A first-order Markov model revealed a distinct pattern of emission for the bray series’ elements, with uneven transitions between elements. The order in these sequential emissions was not random and consecutive decreases in higher order entropy values support the notion of a well-defined structure in the bray series. The key features of animal signal sequences here portrayed suggest the presence of relevant information content and highlight the complexity of the bottlenose dolphin’s acoustic repertoire.
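The entropy comparison described above can be illustrated with a small sketch. It assumes plug-in (frequency-count) estimates of zero-order entropy and of first-order conditional entropy; the abstract does not state which estimators were used, and the element labels in any example sequence are hypothetical.

```python
from collections import Counter
from math import log2


def zero_order_entropy(seq):
    """Shannon entropy (bits) of element frequencies, ignoring order."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def first_order_entropy(seq):
    """Conditional entropy H(next | current) from observed transitions."""
    pairs = Counter(zip(seq, seq[1:]))   # counts of (current, next)
    firsts = Counter(seq[:-1])           # counts of each 'current' element
    n = len(seq) - 1
    h = 0.0
    for (current, _next), c in pairs.items():
        h -= (c / n) * log2(c / firsts[current])
    return h
```

A strictly alternating sequence such as A B A B A B A B has a zero-order entropy of 1 bit but a first-order entropy of 0, because each element fully determines the next. A drop in entropy when the preceding element is taken into account is the kind of signature the abstract reads as non-random ordering.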

17.
Alkylresorcinol moieties of cannabinoids are derived from olivetolic acid (OLA), a polyketide metabolite. However, the polyketide synthase (PKS) responsible for OLA biosynthesis has not been identified. In the present study, a cDNA encoding a novel PKS, olivetol synthase (OLS), was cloned from Cannabis sativa. Recombinant OLS did not produce OLA, but synthesized olivetol, the decarboxylated form of OLA, as the major reaction product. Interestingly, it was also confirmed that the crude enzyme extracts from flowers and rapidly expanding leaves, the cannabinoid-producing tissues of C. sativa, also exhibited olivetol-producing activity, suggesting that the native OLS is functionally expressed in these tissues. The possibility that OLS could be involved in OLA biosynthesis was discussed based on its catalytic properties and expression profile.

18.
Aim: The objective of this paper is to obtain a net primary production (NPP) regression model based on the geographically weighted regression (GWR) method, which includes spatial non‐stationarity in the parameters estimated for forest ecosystems in China. Location: We used data across China. Methods: We examine the relationships between NPP of Chinese forest ecosystems and environmental variables, specifically altitude, temperature, precipitation and time‐integrated normalized difference vegetation index (TINDVI), based on the ordinary least squares (OLS) regression, the spatial lag model and GWR methods. Results: The GWR method made significantly better predictions of NPP in simulations than did OLS, as indicated both by corrected Akaike Information Criterion (AICc) and R². GWR provided a value of 4891 for AICc and 0.66 for R², compared with 5036 and 0.58, respectively, by OLS. GWR has the potential to reveal local patterns in the spatial distribution of a parameter, which would be ignored by the OLS approach. Furthermore, OLS may provide a false general relationship between spatially non‐stationary variables. Spatial autocorrelation violates a basic assumption of the OLS method. The spatial lag model, which takes spatial autocorrelation into account, performed better than OLS in the NPP simulation (5001 for AICc and 0.60 for R²), but still not as well as GWR. Moreover, statistically significant positive spatial autocorrelation remained in the NPP residuals of the spatial lag model at small spatial scales, while no positive spatial autocorrelation could be found in the GWR residuals at any spatial scale. Conclusions: Regression analysis of Chinese forest NPP against environmental factors, based alternatively on OLS, the spatial lag model and GWR, indicated a significant improvement in model performance of GWR over OLS and the spatial lag model.
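The core idea of GWR, a separate weighted least squares fit centred on each location, can be sketched in a few lines. This is a simplified illustration and not the paper's procedure: the Gaussian kernel, the fixed bandwidth and the name `gwr_coefficients` are assumptions, and bandwidth selection (for example by AICc) is omitted.

```python
import numpy as np


def gwr_coefficients(coords, X, y, bandwidth):
    """Local regression coefficients at every observation location.

    coords: (n, 2) locations; X: (n,) or (n, p) predictors; y: (n,).
    Returns an (n, p + 1) array whose first column is the local intercept.
    """
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), np.asarray(X, dtype=float)])
    betas = np.empty((n, design.shape[1]))
    for i in range(n):
        # Gaussian distance-decay weights centred on location i.
        d2 = np.sum((coords - coords[i]) ** 2, axis=1)
        w = np.exp(-0.5 * d2 / bandwidth ** 2)
        sw = np.sqrt(w)
        # Weighted least squares via row scaling by sqrt(weight).
        betas[i] = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)[0]
    return betas
```

If the true relationship is the same everywhere, every row of the result matches the global OLS fit. Variation between rows is what GWR interprets as spatial non-stationarity.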

19.
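Regression models can be used to predict the aboveground biomass of forest ecosystems, and least squares regression is the most commonly used. For predicting the aboveground biomass of shrubs, especially multi-stemmed shrubs, the least squares and Bayesian approaches have rarely been compared. We developed biomass prediction models for Caragana microphylla Lam., a multi-stemmed shrub that is widely distributed in the Horqin Sandy Land and plays an important role in reducing wind erosion and stabilizing sand dunes. We built six allometric models of biomass, selected the one that predicted biomass best according to statistical criteria, and then estimated its parameters with both the least squares and the Bayesian approach. During parameter estimation, bootstrapping was used to examine the effect of sample size, and test and training sets were kept separate. Finally, we compared the performance of the least squares and Bayesian approaches in predicting the aboveground biomass of C. microphylla. All six allometric models were statistically significant, and the model with a power exponent of 1 performed best. Bayesian estimation, with either non-informative or informative priors, gave a lower mean squared error on the test set than least squares. In addition, basal diameter was not a significant predictor under either approach, indicating that variables for biomass prediction models should be chosen with care. This study highlights that combining Bayesian methods, bootstrapping and allometric models can improve the accuracy of biomass prediction models for sandy-land shrubs.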

20.
Aim: In their recent paper, Kissling & Carl (2008) recommended the spatial error simultaneous autoregressive model (SARerr) over ordinary least squares (OLS) for modelling species distribution. We compared these models with the generalized least squares model (GLS) and a variant of SAR (SARvario). GLS and SARvario are superior to standard implementations of SAR because the spatial covariance structure is described by a semivariogram model.
Innovation: We used the complete datasets employed by Kissling & Carl (2008), with strong spatial autocorrelation, and two datasets in which the spatial structure was degraded by sample reduction and grid coarsening. GLS performed consistently better than OLS, SARerr and SARvario in all datasets, especially in terms of goodness of fit. SARvario was marginally better than SARerr in the degraded datasets.
Main conclusions: GLS was more reliable than SAR-based models, so its use is recommended when dealing with spatially autocorrelated data.
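A minimal sketch of the GLS estimator, β = (XᵀV⁻¹X)⁻¹XᵀV⁻¹y, with an error covariance V built from a spatial model. It assumes an exponential covariance with known sill, range and nugget; the abstract does not specify the semivariogram model, in practice these parameters would be estimated from the data, and both function names are hypothetical.

```python
import numpy as np


def exponential_covariance(coords, sill, range_, nugget=0.0):
    """Covariance matrix V[i, j] = sill * exp(-d_ij / range_), plus nugget on the diagonal."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return sill * np.exp(-dist / range_) + nugget * np.eye(len(coords))


def gls_fit(X, y, V):
    """GLS coefficients, solving linear systems rather than inverting V."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    return np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)
```

With V equal to the identity matrix, `gls_fit` reduces to OLS. A spatially structured V changes how much weight clustered observations carry relative to isolated ones.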
