期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

FLUXNET and modelling the global carbon cycle 总被引：3，自引：0，他引：3

ANDREW D. FRIEND ALMUT ARNETH† NANCY Y. KIANG‡ MARK LOMAS§ JÉRÔME OGÉE¶ CHRISTIAN RÖDENBECK&#; STEVEN W. RUNNING JEAN-DIEGO SANTAREN STEPHEN SITCH†† NICOLAS VIOVY F. IAN WOODWARD§ SÖNKE ZAEHLE†† 《Global Change Biology》2007,13(3):610-633

Measurements of the net CO₂ flux between terrestrial ecosystems and the atmosphere using the eddy covariance technique have the potential to underpin our interpretation of regional CO₂ source–sink patterns, CO₂ flux responses to forcings, and predictions of the future terrestrial C balance. Information contained in FLUXNET eddy covariance data has multiple uses for the development and application of global carbon models, including evaluation/validation, calibration, process parameterization, and data assimilation. This paper reviews examples of these uses, compares global estimates of the dynamics of the global carbon cycle, and suggests ways of improving the utility of such data for global carbon modelling. Net ecosystem exchange of CO₂ (NEE) predicted by different terrestrial biosphere models compares favourably with FLUXNET observations at diurnal and seasonal timescales. However, complete model validation, particularly over the full annual cycle, requires information on the balance between assimilation and decomposition processes, information not readily available for most FLUXNET sites. Site history, when known, can greatly help constrain the model‐data comparison. Flux measurements made over four vegetation types were used to calibrate the land‐surface scheme of the Goddard Institute for Space Studies global climate model, significantly improving simulated climate and demonstrating the utility of diurnal FLUXNET data for climate modelling. Land‐surface temperatures in many regions cool due to higher canopy conductances and latent heat fluxes, and the spatial distribution of CO₂ uptake provides a significant additional constraint on the realism of simulated surface fluxes. FLUXNET data are used to calibrate a global production efficiency model (PEM). This model is forced by satellite‐measured absorbed radiation and suggests that global net primary production (NPP) increased 6.2% over 1982–1999. Good agreement is found between global trends in NPP estimated by the PEM and a dynamic global vegetation model (DGVM), and between the DGVM and estimates of global NEE derived from a global inversion of atmospheric CO₂ measurements. Combining the PEM, DGVM, and inversion results suggests that CO₂ fertilization is playing a major role in current increases in NPP, with lesser impacts from increasing N deposition and growing season length. Both the PEM and the inversion identify the Amazon basin as a key region for the current net terrestrial CO₂ uptake (i.e. 33% of global NEE), as well as its interannual variability. The inversion's global NEE estimate of −1.2 Pg [C] yr⁻¹ for 1982–1995 is compatible with the PEM‐ and DGVM‐predicted trends in NPP. There is, thus, a convergence in understanding derived from process‐based models, remote‐sensing‐based observations, and inversion of atmospheric data. Future advances in field measurement techniques, including eddy covariance (particularly concerning the problem of night‐time fluxes in dense canopies and of advection or flow distortion over complex terrain), will result in improved constraints on land‐atmosphere CO₂ fluxes and the rigorous attribution of mechanisms to the current terrestrial net CO₂ uptake and its spatial and temporal heterogeneity. Global ecosystem models play a fundamental role in linking information derived from FLUXNET measurements to atmospheric CO₂ variability. A number of recommendations concerning FLUXNET data are made, including a request for more comprehensive site data (particularly historical information), more measurements in undisturbed ecosystems, and the systematic provision of error estimates. The greatest value of current FLUXNET data for global carbon cycle modelling is in evaluating process representations, rather than in providing an unbiased estimate of net CO₂ exchange. 相似文献

2.

Integrated statistical analysis of cDNA microarray and NIR spectroscopic data applied to a hemp dataset

Reijmers TH Maliepaard C van den Broeck HC Kessler RW Toonen MA van der Voet H 《Journal of bioinformatics and computational biology》2005,3(4):891-913

Both cDNA microarray and spectroscopic data provide indirect information about the chemical compounds present in the biological tissue under consideration. In this paper simple univariate and bivariate measures are used to investigate correlations between both types of high dimensional analyses. A large dataset of 42 hemp samples on which 3456 cDNA clones and 351 NIR wavelengths have been measured, was analyzed using graphical representations. For this purpose we propose clustered correlation and clustered discrimination images. Large, tissue-related differences are seen to dominate the cDNA-NIR correlation structure but smaller, more difficult to detect, variety-related differences can be found at specific cDNA clone/NIR wavelength combinations. 相似文献

3.

Regional electricity consumption mixes using trade data for representative inventories

Hottle Troy Ghosh Tapajyoti 《The International Journal of Life Cycle Assessment》2021,26(6):1211-1222

The International Journal of Life Cycle Assessment - Electricity flows are frequently used life cycle assessment modeling and are often a significant source of emissions so creating inventories... 相似文献

4.

An effective statistical evaluation of ChIPseq dataset similarity

Chikina MD Troyanskaya OG 《Bioinformatics (Oxford, England)》2012,28(5):607-613

相似文献

5.

Peptide vaccine models using statistical data mining

Joshi RR 《Protein and peptide letters》2007,14(6):536-542

Design and synthesis of peptide vaccines is of significant pharmaceutical importance. A knowledge based statistical model is fitted here for prediction of binding of an antigenic site of a protein or a B-cell epitope on a CDR (complementarity determining region) of an immunoglobulin. Linear analogues of the 3D structure of the epitopes are computed using this model. Extension for prediction of peptide epitopes from the protein sequence alone is also presented. Validation results show promising potential of this approach in computer-aided peptide vaccine production. The computed probabilities of binding also provide a pioneering approach for ab-initio prediction of 'potency' of protein or peptide vaccines modeled by this method. 相似文献

6.

On the statistical assessment of classifiers using DNA microarray data

N Ancona R Maglietta A Piepoli A D'Addabbo R Cotugno M Savino S Liuni M Carella G Pesole F Perri 《BMC bioinformatics》2006,7(1):387-14

Background

In this paper we present a method for the statistical assessment of cancer predictors which make use of gene expression profiles. The methodology is applied to a new data set of microarray gene expression data collected in Casa Sollievo della Sofferenza Hospital, Foggia – Italy. The data set is made up of normal (22) and tumor (25) specimens extracted from 25 patients affected by colon cancer. We propose to give answers to some questions which are relevant for the automatic diagnosis of cancer such as: Is the size of the available data set sufficient to build accurate classifiers? What is the statistical significance of the associated error rates? In what ways can accuracy be considered dependant on the adopted classification scheme? How many genes are correlated with the pathology and how many are sufficient for an accurate colon cancer classification? The method we propose answers these questions whilst avoiding the potential pitfalls hidden in the analysis and interpretation of microarray data.

Results

We estimate the generalization error, evaluated through the Leave-K-Out Cross Validation error, for three different classification schemes by varying the number of training examples and the number of the genes used. The statistical significance of the error rate is measured by using a permutation test. We provide a statistical analysis in terms of the frequencies of the genes involved in the classification. Using the whole set of genes, we found that the Weighted Voting Algorithm (WVA) classifier learns the distinction between normal and tumor specimens with 25 training examples, providing e = 21% (p = 0.045) as an error rate. This remains constant even when the number of examples increases. Moreover, Regularized Least Squares (RLS) and Support Vector Machines (SVM) classifiers can learn with only 15 training examples, with an error rate of e = 19% (p = 0.035) and e = 18% (p = 0.037) respectively. Moreover, the error rate decreases as the training set size increases, reaching its best performances with 35 training examples. In this case, RLS and SVM have error rates of e = 14% (p = 0.027) and e = 11% (p = 0.019). Concerning the number of genes, we found about 6000 genes (p < 0.05) correlated with the pathology, resulting from the signal-to-noise statistic. Moreover the performances of RLS and SVM classifiers do not change when 74% of genes is used. They progressively reduce up to e = 16% (p < 0.05) when only 2 genes are employed. The biological relevance of a set of genes determined by our statistical analysis and the major roles they play in colorectal tumorigenesis is discussed.

Conclusions

The method proposed provides statistically significant answers to precise questions relevant for the diagnosis and prognosis of cancer. We found that, with as few as 15 examples, it is possible to train statistically significant classifiers for colon cancer diagnosis. As for the definition of the number of genes sufficient for a reliable classification of colon cancer, our results suggest that it depends on the accuracy required. 相似文献

7.

Multivariate statistical and other approaches for the separation of cereal from wild Poaceae pollen using a large Holocene dataset

John C. Tweddle Kevin J. Edwards Nick R. J. Fieller 《Vegetation History and Archaeobotany》2005,14(1):15-30

The separation of the pollen of wild Poaceae species from that of domesticated cereal crops is of considerable importance to palynologists studying Holocene vegetational and agricultural change. Studies of the characteristics of modern pollen populations indicate that it may be possible to distinguish cereal pollen from that of many (but not all) undomesticated Poaceae species, though there are few detailed investigations into the applicability of such studies to palaeoecological samples. This paper assesses the reliability of available keys for identifying sub-fossil grass pollen using a large Holocene dataset obtained from a series of well-dated profiles from lowland Yorkshire, England. Pollen within the dataset is classified using the keys of Andersen (Danmarks Geol Undersøgelse, Arbog, 1978, 69–92, 1979) and Küster (1988), and the resulting identifications are compared. The possibilities of combining the two approaches and employing the multivariate statistical techniques of principal component and discriminant analysis to achieve greater confidence of identification are then investigated. Finally, the findings of the above analyses are used to discuss the interpretation of incidences of large Poaceae pollen (i.e. >37 m grain diameter as measured in silicone oil) within the palynological record, particularly during prehistory. The outcomes of this study will be of relevance to other investigations in which careful identification of large grass pollen is desirable, but where preservation or other factors prohibit accurate or confident identification of pollen surface pattern. 相似文献

8.

Improved statistical inference from DNA microarray data using analysis of variance and a Bayesian statistical framework. Analysis of global gene expression in Escherichia coli K12 总被引：4，自引：0，他引：4

Long AD Mangalam HJ Chan BY Tolleri L Hatfield GW Baldi P 《The Journal of biological chemistry》2001,276(23):19937-19944

We describe statistical methods based on the t test that can be conveniently used on high density array data to test for statistically significant differences between treatments. These t tests employ either the observed variance among replicates within treatments or a Bayesian estimate of the variance among replicates within treatments based on a prior estimate obtained from a local estimate of the standard deviation. The Bayesian prior allows statistical inference to be made from microarray data even when experiments are only replicated at nominal levels. We apply these new statistical tests to a data set that examined differential gene expression patterns in IHF(+) and IHF(-) Escherichia coli cells (Arfin, S. M., Long, A. D., Ito, E. T., Tolleri, L., Riehle, M. M., Paegle, E. S., and Hatfield, G. W. (2000) J. Biol. Chem. 275, 29672-29684). These analyses identify a more biologically reasonable set of candidate genes than those identified using statistical tests not incorporating a Bayesian prior. We also show that statistical tests based on analysis of variance and a Bayesian prior identify genes that are up- or down-regulated following an experimental manipulation more reliably than approaches based only on a t test or fold change. All the described tests are implemented in a simple-to-use web interface called Cyber-T that is located on the University of California at Irvine genomics web site. 相似文献

9.

Predicting the onset of net carbon uptake by deciduous forests with soil temperature and climate data: a synthesis of FLUXNET data

Baldocchi DD Black TA Curtis PS Falge E Fuentes JD Granier A Gu L Knohl A Pilegaard K Schmid HP Valentini R Wilson K Wofsy S Xu L Yamamoto S 《International journal of biometeorology》2005,49(6):377-387

We tested the hypothesis that the date of the onset of net carbon uptake by temperate deciduous forest canopies corresponds with the time when the mean daily soil temperature equals the mean annual air temperature. The hypothesis was tested using over 30 site-years of data from 12 field sites where CO₂ exchange is being measured continuously with the eddy covariance method. The sites spanned the geographic range of Europe, North America and Asia and spanned a climate space of 16°C in mean annual temperature. The tested phenology rule was robust and worked well over a 75 day range of the initiation of carbon uptake, starting as early as day 88 near Ione, California to as late as day 147 near Takayama, Japan. Overall, we observed that 64% of variance in the timing when net carbon uptake started was explained by the date when soil temperature matched the mean annual air temperature. We also observed a strong correlation between mean annual air temperature and the day that a deciduous forest starts to be a carbon sink. Consequently we are able to provide a simple phenological rule that can be implemented in regional carbon balance models and be assessed with soil and temperature outputs produced by climate and weather models. 相似文献

10.

Challenges and opportunities in synthesizing historical geospatial data using statistical models

《Ecological Informatics》2016

We classified land cover types from 1940s historical aerial imagery using Object Based Image Analysis (OBIA) and compared these maps with data on recent cover. Few studies have used these kinds of maps to model drivers of cover change, partly due to two statistical challenges: 1) appropriately accounting for spatial autocorrelation and 2) appropriately modeling percent cover which is bounded between 0 and 100 and not normally distributed. We studied the change in woody cover at four sites in California's North Coast using historical (1948) and recent (2009) high spatial resolution imagery. We classified the imagery using eCognition Developer and aggregated the resulting maps to the scale of a Digital Elevation Model (DEM) in order to understand topographic drivers of woody cover change. We used Generalized Additive Models (GAMs) with a quasi-binomial probability distribution to account for spatial autocorrelation and the boundedness of the percent woody cover variable. We explored the relative influences on current percent woody cover of topographic variables (grouped using principal component analysis) reflecting water retention capacity, exposure, and within-site context, as well as historical percent woody cover and geographical coordinates. We estimated these models for pixel sizes of 20, 30, 40, 50, 60, 70, 80, 90, and 100 m, reflecting both tree neighborhood scales and stand scales. We found that historical woody cover had a consistent positive effect on current woody cover, and that the spatial autoregressive term in the model was significant even after controlling for historical cover. Specific topographic variables emerged as important for different sites at different scales, but no overall pattern emerged across sites or scales for any of the topographic variables we tested. This GAM framework for modeling historical data is flexible and could be used with more variables, more flexible relationships with predictor variables, and larger scales. Modeling drivers of woody cover change from historical ecology data sources can be a valuable way to plan restoration and enhance ecological insight into landscape change. 相似文献

11.

The completeness of taxonomic inventories for describing the global diversity and distribution of marine fishes

Mora C Tittensor DP Myers RA 《Proceedings. Biological sciences / The Royal Society》2008,275(1631):149-155

相似文献

12.

Comparison of statistical population reconstruction using full and pooled adult age-class data

Skalski JR Millspaugh JJ Clawson MV 《PloS one》2012,7(3):e33910

Background

Age-at-harvest data are among the most commonly collected, yet neglected, demographic data gathered by wildlife agencies. Statistical population construction techniques can use this information to estimate the abundance of wild populations over wide geographic areas and concurrently estimate recruitment, harvest, and natural survival rates. Although current reconstruction techniques use full age-class data (0.5, 1.5, 2.5, 3.5, … years), it is not always possible to determine an animal''s age due to inaccuracy of the methods, expense, and logistics of sample collection. The ability to inventory wild populations would be greatly expanded if pooled adult age-class data (e.g., 0.5, 1.5, 2.5+ years) could be successfully used in statistical population reconstruction.

Methodology/Principal Findings

We investigated the performance of statistical population reconstruction models developed to analyze full age-class and pooled adult age-class data. We performed Monte Carlo simulations using a stochastic version of a Leslie matrix model, which generated data over a wide range of abundance levels, harvest rates, and natural survival probabilities, representing medium-to-big game species. Results of full age-class and pooled adult age-class population reconstructions were compared for accuracy and precision. No discernible difference in accuracy was detected, but precision was slightly reduced when using the pooled adult age-class reconstruction. On average, the coefficient of variation increased by 0.059 when the adult age-class data were pooled prior to analyses. The analyses and maximum likelihood model for pooled adult age-class reconstruction are illustrated for a black-tailed deer (Odocoileus hemionus) population in Washington State.

Conclusions/Significance

Inventorying wild populations is one of the greatest challenges of wildlife agencies. These new statistical population reconstruction models should expand the demographic capabilities of wildlife agencies that have already collected pooled adult age-class data or are seeking a cost-effective method for monitoring the status and trends of our wild resources. 相似文献

13.

Exploration of phylogenetic data using a global sequence analysis method

Charles?Chapus Christine?Dufraigne Scott?Edwards Alain?Giron Bernard?Fertil Patrick?Deschavanne Email author 《BMC evolutionary biology》2005,5(1):63

Background

Molecular phylogenetic methods are based on alignments of nucleic or peptidic sequences. The tremendous increase in molecular data permits phylogenetic analyses of very long sequences and of many species, but also requires methods to help manage large datasets. 相似文献

14.

Exon array data analysis using Affymetrix power tools and R statistical software

Lockstone HE 《Briefings in bioinformatics》2011,12(6):634-644

相似文献

15.

Improving the statistical detection of regulated genes from microarray data using intensity-based variance estimation

Comander J Natarajan S Gimbrone MA García-Cardeña G 《BMC genomics》2004,5(1):17-21

Background

Gene microarray technology provides the ability to study the regulation of thousands of genes simultaneously, but its potential is limited without an estimate of the statistical significance of the observed changes in gene expression. Due to the large number of genes being tested and the comparatively small number of array replicates (e.g., N = 3), standard statistical methods such as the Student's t-test fail to produce reliable results. Two other statistical approaches commonly used to improve significance estimates are a penalized t-test and a Z-test using intensity-dependent variance estimates. 相似文献

16.

On the statistical analysis of batch data

A. A. Esener J. A. Roels N. W. F. Kossen 《Biotechnology and bioengineering》1981,23(10):2391-2396

相似文献

17.

New developments in CLAMP: Calibration using global gridded meteorological data

R.A. Spicer P.J. Valdes T.E.V. Spicer H.J. Craggs G. Srivastava R.C. Mehrotra J. Yang 《Palaeogeography, Palaeoclimatology, Palaeoecology》2009,283(1-2):91-98

Climate Leaf Analysis Multivariate Program (CLAMP) is a versatile technique for obtaining quantitative estimates for multiple terrestrial palaeoclimate variables from woody dicot leaf assemblages. To date it has been most widely applied to the Late Cretaceous and Tertiary of the mid- to high latitudes because of concerns over the relative dearth of calibration sites in modern low-latitude warm climates, and the loss of information associated with the lack of marginal teeth on leaves in paratropical to tropical vegetation. This limits CLAMP's ability to quantify reliably climates at low latitudes in greenhouse worlds of the past.One of the reasons for the lack of CLAMP calibration samples from warm environments is the paucity of climate stations close to potential calibration vegetation sites at low latitudes. Agriculture and urban development have destroyed most lowland sites and natural vegetation is now largely confined to mountainous areas where climate stations are few and climatic spatial variation is high due to topographic complexity. To attempt to overcome this we have utilised a 0.5° × 0.5° grid of global interpolated climate data based on the data set of New et al. (1999) supplemented by the ERA40 re-analysis data for atmospheric temperature at upper levels. For each location, the 3-D climatology of temperature from the ECMWF re-analysis project was used to calculate the mean lower tropospheric lapse rate for each month of the year. The gridded data were then corrected to the altitude of the plant site using the monthly lapse rates. Corrections for humidity were also made. From this the commonly returned CLAMP climate variables were calculated. A bilinear interpolation scheme was then used to calculate the climate parameters at the exact lat/long of the site.When CLAMP analyses using the PHYSG3BR physiognomic data calibrated with the climate station based MET3BR were compared to analyses using the gridded data at the same locations (GRIDMET3BR), the results were indistinguishable in that they fell within the range of statistical uncertainty determined for each analysis. This opens the way to including natural vegetation anywhere in the world irrespective of the proximity of a meteorological station. 相似文献

18.

Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data 总被引：11，自引：0，他引：11

Wu B Abbott T Fishman D McMurray W Mor G Stone K Ward D Williams K Zhao H 《Bioinformatics (Oxford, England)》2003,19(13):1636-1643

MOTIVATION: Novel methods, both molecular and statistical, are urgently needed to take advantage of recent advances in biotechnology and the human genome project for disease diagnosis and prognosis. Mass spectrometry (MS) holds great promise for biomarker identification and genome-wide protein profiling. It has been demonstrated in the literature that biomarkers can be identified to distinguish normal individuals from cancer patients using MS data. Such progress is especially exciting for the detection of early-stage ovarian cancer patients. Although various statistical methods have been utilized to identify biomarkers from MS data, there has been no systematic comparison among these approaches in their relative ability to analyze MS data. RESULTS: We compare the performance of several classes of statistical methods for the classification of cancer based on MS spectra. These methods include: linear discriminant analysis, quadratic discriminant analysis, k-nearest neighbor classifier, bagging and boosting classification trees, support vector machine, and random forest (RF). The methods are applied to ovarian cancer and control serum samples from the National Ovarian Cancer Early Detection Program clinic at Northwestern University Hospital. We found that RF outperforms other methods in the analysis of MS data. 相似文献

19.

Challenges in using land use and land cover data for global change studies 总被引：5，自引：0，他引：5

PETER H. VERBURG KATHLEEN NEUMANN LINDA NOL 《Global Change Biology》2011,17(2):974-989

Land use and land cover data play a central role in climate change assessments. These data originate from different sources and inventory techniques. Each source of land use/cover data has its own domain of applicability and quality standards. Often data are selected without explicitly considering the suitability of the data for the specific application, the bias originating from data inventory and aggregation, and the effects of the uncertainty in the data on the results of the assessment. Uncertainties due to data selection and handling can be in the same order of magnitude as uncertainties related to the representation of the processes under investigation. While acknowledging the differences in data sources and the causes of inconsistencies, several methods have been developed to optimally extract information from the data and document the uncertainties. These methods include data integration, improved validation techniques and harmonization of classification systems. Based on the data needs of global change studies and the data availability, recommendations are formulated aimed at optimal use of current data and focused efforts for additional data collection. These include: improved documentation using classification systems for land use/cover data; careful selection of data given the specific application and the use of appropriate scaling and aggregation methods. In addition, the data availability may be improved by the combination of different data sources to optimize information content while collection of additional data must focus on validation of available data sets and improved coverage of regions and land cover types with a high level of uncertainty. Specific attention in data collection should be given to the representation of land management (systems) and mosaic landscapes. 相似文献

20.

Deriving sustainability measures using statistical data: A case study from the Eisenwurzen, Austria

Friedrich Putzhuber Hubert Hasenauer 《Ecological Indicators》2010,10(1):32-38

Within the past two decades sustainability has become a key term in emphasizing and understanding relationships between economic progress and the protection of the environment. One key difficulty is in the definition of sustainability indicators based on information at different spatial and temporal scales. In this paper we formalize statistical models for the assessment of sustainability impact indicators using a public data source provided by the Austrian government. Our application example is the Eisenwurzen region in Austria, an old and famous mining area within the Alps. The total area covers 5.743 km² and includes 99 municipalities. In our study we define 15 impact indicators covering economic, social and environmental impacts. For each of the impact indicators we develop response functions using the available public data sources. The results suggest that the available data are an important source for deriving sustainable impact indicators within specific regions. The presented approach may serve as diagnostic tool to provide insights into the regional drivers for assessing sustainability indicators. 相似文献