Found 20 similar articles.

A cross-validation statistical framework for asymmetric data integration

Lam Tran Kevin He Di Wang Hui Jiang 《Biometrics》2023,79(2):1280-1292

The proliferation of biobanks and large public clinical data sets enables their integration with a smaller amount of locally gathered data for the purposes of parameter estimation and model prediction. However, public data sets may be subject to context-dependent confounders and the protocols behind their generation are often opaque; naively integrating all external data sets equally can bias estimates and lead to spurious conclusions. Weighted data integration is a potential solution, but current methods still require subjective specifications of weights and can become computationally intractable. Under the assumption that local data are generated from the set of unknown true parameters, we propose a novel weighted integration method based upon using the external data to minimize the local data leave-one-out cross-validation (LOOCV) error. We demonstrate how the optimization of LOOCV errors for linear and Cox proportional hazards models can be rewritten as functions of external data set integration weights. Significant reductions in estimation error and prediction error are shown using simulation studies mimicking the heterogeneity of clinical data as well as a real-world example using kidney transplant patients from the Scientific Registry of Transplant Recipients.
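The idea of choosing an external-data weight by minimizing the local LOOCV error can be illustrated with a small sketch. This is not the authors' algorithm (the abstract says they rewrite the LOOCV error as a function of the weights); it is a brute-force grid search over a single weight for a one-predictor linear model, and the function names and toy data sets are hypothetical.

```python
def weighted_fit(points):
    """Weighted least squares for y = a + b*x; points are (x, y, weight) triples."""
    sw = sum(w for _, _, w in points)
    mx = sum(w * x for x, _, w in points) / sw
    my = sum(w * y for _, y, w in points) / sw
    sxx = sum(w * (x - mx) ** 2 for x, _, w in points)
    sxy = sum(w * (x - mx) * (y - my) for x, y, w in points)
    slope = sxy / sxx
    return my - slope * mx, slope


def local_loocv_error(local, external, ext_weight):
    """Mean squared leave-one-out error on the local data only.

    Each local point is held out in turn; the model is refit on the remaining
    local points (weight 1) plus all external points (weight ext_weight).
    """
    total = 0.0
    for i, (xi, yi) in enumerate(local):
        train = [(x, y, 1.0) for j, (x, y) in enumerate(local) if j != i]
        train += [(x, y, ext_weight) for x, y in external]
        intercept, slope = weighted_fit(train)
        total += (yi - (intercept + slope * xi)) ** 2
    return total / len(local)


def choose_weight(local, external, grid):
    """Pick the external weight on the grid with the smallest local LOOCV error."""
    return min(grid, key=lambda w: local_loocv_error(local, external, w))


# Hypothetical toy data: local points scatter around y = 2x.
local = [(0.0, 0.1), (1.0, 1.9), (2.0, 4.2), (3.0, 5.8)]
external_consistent = [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)]   # agrees with local trend
external_conflicting = [(0.5, 5.0), (1.5, 2.0), (2.5, -1.0)]  # opposite trend
grid = [0.0, 0.25, 0.5, 1.0]

weight_for_conflicting = choose_weight(local, external_conflicting, grid)
```

In this toy setting the conflicting external set is given zero weight, while including the consistent set lowers the local LOOCV error relative to ignoring it.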

Assessing the power of informative subsets of loci for population assignment: standard methods are upwardly biased

Anderson EC 《Molecular ecology resources》2010,10(4):701-710

It is well known that statistical classification procedures should be assessed using data that are separate from those used to train the classifier. This principle is commonly overlooked when the classification procedure in question is population assignment using a set of genetic markers that were chosen specifically on the basis of their allele frequencies from amongst a larger number of candidate markers. This oversight leads to a systematic upward bias in the predicted accuracy of the chosen set of markers for population assignment. Three widely used software programs for selecting markers informative for population assignment suffer from this bias. The extent of this bias is documented through a small set of simulations. The relative effect of the bias is largest when screening many candidate loci from poorly differentiated populations. Simple unbiased methods are presented and their use encouraged.

Cell culture process metabolomics together with multivariate data analysis tools opens new routes for bioprocess development and glycosylation prediction

Philipp Zürcher Michael Sokolov David Brühlmann Raphael Ducommun Matthieu Stettler Jonathan Souquet Martin Jordan Hervé Broly Massimo Morbidelli Alessandro Butté 《Biotechnology progress》2020,36(5):e3012

Multivariate latent variable methods have become a popular and versatile toolset to analyze bioprocess data in industry and academia. This work spans such applications from the evaluation of the role of the standard process variables and metabolites to the metabolomics level, that is, to the extensive number of metabolic compounds detectable in the extracellular and intracellular domains. Given the substantial effort currently required for the measurement of the latter groups, a tailored methodology is presented that is capable of providing valuable process insights as well as predicting the glycosylation profile based on only four experiments measured over 12 cell culture days. An important result of the work is the possibility to accurately predict many of the glycan variables based on the information of three experiments. An additional finding is that such predictive models can be generated from the more accessible process and extracellular information only, that is, without including the more experimentally cumbersome intracellular data. With regard to the incorporation of omics data in the standard process analytics framework in the future, this work provides a comprehensive data analysis pathway which can efficiently support numerous bioprocessing tasks.

The origin of correlations in metabolomics data (total citations: 7; self-citations: 0; citations by others: 7)

Diogo Camacho Alberto de la Fuente Pedro Mendes 《Metabolomics : Official journal of the Metabolomic Society》2005,1(1):53-63

A phenomenon observed earlier in the development of metabolomics as a systems biology methodology consists of a small but significant number of metabolites whose levels are highly correlated between biological replicates. Contrary to initial interpretations, these correlations are not necessarily only between neighboring metabolites in the metabolic network. Most metabolites that participate in common reactions are not correlated in this way, while some non-neighboring metabolites are highly correlated. Here we investigate the origin of such correlations using metabolic control analysis and computer simulation of biochemical networks. A series of cases is identified which lead to high correlation between metabolite pairs in replicate measurement. These are (1) chemical equilibrium, (2) mass conservation, (3) asymmetric control distribution, and (4) unusually high variance in the expression of a single gene. The importance of identifying metabolite correlations within a physiological state and changes of correlation between different states is discussed in the context of systems biology.
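Two of the listed cases, chemical equilibrium and mass conservation, can be illustrated with idealized, noise-free numbers. The sketch below is only an illustration under simplifying assumptions, not the authors' simulation: the replicate levels, the conserved pool size `TOTAL_POOL`, and the equilibrium constant `K_EQ` are hypothetical values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical levels of metabolite A across five biological replicates.
metabolite_a = [1.0, 1.4, 0.8, 1.9, 1.2]

# Mass conservation (assumed): A and B share a fixed pool, so B = TOTAL_POOL - A.
TOTAL_POOL = 5.0
b_conserved = [TOTAL_POOL - a for a in metabolite_a]

# Chemical equilibrium (assumed): B / A is pinned at the constant K_EQ.
K_EQ = 2.5
b_equilibrium = [K_EQ * a for a in metabolite_a]

r_conserved = pearson(metabolite_a, b_conserved)      # perfectly negative
r_equilibrium = pearson(metabolite_a, b_equilibrium)  # perfectly positive
```

In real replicate data, measurement noise and other fluxes would pull these correlations away from exactly -1 and +1.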

(Re-)use and (re-)analysis of publicly available metabolomics data

Michael Witting 《Proteomics》2023,23(23-24):2300032

Domain selection and familywise error rate for functional data: A unified framework

Konrad Abramowicz Alessia Pini Lina Schelin Sara Sjöstedt de Luna Aymeric Stamm Simone Vantini 《Biometrics》2023,79(2):1119-1132

Functional data are smooth, often continuous, random curves, which can be seen as an extreme case of multivariate data with infinite dimensionality. Just as componentwise inference for multivariate data naturally performs feature selection, subsetwise inference for functional data performs domain selection. In this paper, we present a unified testing framework for domain selection on populations of functional data. In detail, p-values of hypothesis tests performed on pointwise evaluations of functional data are suitably adjusted for providing control of the familywise error rate (FWER) over a family of subsets of the domain. We show that several state-of-the-art domain selection methods fit within this framework and differ from each other by the choice of the family over which the control of the FWER is provided. In the existing literature, these families are always defined a priori. In this work, we also propose a novel approach, coined thresholdwise testing, in which the family of subsets is instead built in a data-driven fashion. The method seamlessly generalizes to multidimensional domains in contrast to methods based on a priori defined families. We provide theoretical results with respect to consistency and control of the FWER for the methods within the unified framework. We illustrate the performance of the methods within the unified framework on simulated and real data examples and compare their performance with other existing methods.

Integration of transcriptomics and metabolomics data specifies the metabolic response of Chlamydomonas to rapamycin treatment

Sabrina Kleessen Susann Irgang Sebastian Klie Patrick Giavalisco Zoran Nikoloski 《The Plant journal : for cell and molecular biology》2015,81(5):822-835

A statistical test method for model simulation results

周继华来利明郑元润《生态学报》2015,35(19):6435-6438

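The accuracy of simulation results is key to judging whether an ecological model is successful, yet few reports use statistical methods to assess how well simulated results agree with observations. Building on the statistical test for whether two linear regression equations can be merged into one, we propose a test that checks whether the intercept and slope of the linear regression between observed and simulated values are the same as those of the 1:1 line, and thereby judges, at a chosen significance level, the agreement between an ecological model's simulated and observed values. Tests on data show that this method handles the problem of judging the accuracy of ecological model simulations well.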
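One common way to compare a fitted regression line with the 1:1 line is a joint F statistic for intercept 0 and slope 1. The sketch below uses that statistic as a stand-in; the abstract does not give the authors' exact formula, so the statistic, the choice to regress observed on simulated values, the function name, and the toy data are all assumptions.

```python
def identity_line_f_stat(simulated, observed):
    """Regress observed on simulated and compare the fit with the 1:1 line.

    Returns (intercept, slope, F), where F tests intercept = 0 and slope = 1
    jointly with (2, n - 2) degrees of freedom.
    """
    n = len(simulated)
    mx = sum(simulated) / n
    my = sum(observed) / n
    sxx = sum((x - mx) ** 2 for x in simulated)
    sxy = sum((x - mx) * (y - my) for x, y in zip(simulated, observed))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual sum of squares around the fitted line and around y = x.
    sse_fit = sum((y - (intercept + slope * x)) ** 2
                  for x, y in zip(simulated, observed))
    sse_identity = sum((y - x) ** 2 for x, y in zip(simulated, observed))
    f_stat = ((sse_identity - sse_fit) / 2) / (sse_fit / (n - 2))
    return intercept, slope, f_stat


# Hypothetical model output and two sets of observations.
simulated = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
observed_close = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]   # scatters around the 1:1 line
observed_biased = [2.0, 3.4, 4.9, 6.1, 7.6, 9.0]  # systematically too high

_, _, f_close = identity_line_f_stat(simulated, observed_close)
_, _, f_biased = identity_line_f_stat(simulated, observed_biased)
```

The F value would then be compared with the critical value of the F distribution with (2, n - 2) degrees of freedom at the chosen significance level; a small value, as for `observed_close`, gives no evidence against agreement with the 1:1 line.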

Effects of long-time series of data on genetic evaluations for performance of Swedish Warmblood riding horses

Viklund A Näsholm A Strandberg E Philipsson J 《Animal : an international journal of animal bioscience》2010,4(11):1823-1831

For Swedish Warmblood sport horses, breeding values (BVs) are predicted using a multiple-trait animal model with results from competitions and young horse performance tests. Data go back to the beginning of the 1970s, and earlier studies have indicated that some of the recorded traits have changed through the years. The objective of this study was to investigate the effects of including all performance data or excluding the older ones compared to a bivariate model (BM) considering performance traits in early and late periods as separate traits. The bivariate approach was assumed to give the most correct BVs for the actual breeding population. Competition results in dressage and show jumping for almost 40 000 horses until 2006 were available. For riding horse quality test (RHQT), data of 14 000 horses judged between 1973 and 2007 were used. Genetic correlations of 0.69 to 1.00 were estimated between traits recorded at different time periods (RHQT data) or different birth year groups (competition data). A cross-validation study and comparison of BVs using different sets of data showed that the most accurate and similar results were obtained when BVs were predicted from either the BM or the univariate model including all data from the beginning of the recording. We recommend using all data and applying the univariate model to minimise the computational efforts for genetic evaluations and for provision of reliable BVs for as many horses as possible.

10.

A new generation of crystallographic validation tools for the protein data bank

Read RJ Adams PD Arendall WB Brunger AT Emsley P Joosten RP Kleywegt GJ Krissinel EB Lütteke T Otwinowski Z Perrakis A Richardson JS Sheffler WH Smith JL Tickle IJ Vriend G Zwart PH 《Structure (London, England : 1993)》2011,19(10):1395-1412

This report presents the conclusions of the X-ray Validation Task Force of the worldwide Protein Data Bank (PDB). The PDB has expanded massively since current criteria for validation of deposited structures were adopted, allowing a much more sophisticated understanding of all the components of macromolecular crystals. The size of the PDB creates new opportunities to validate structures by comparison with the existing database, and the now-mandatory deposition of structure factors creates new opportunities to validate the underlying diffraction data. These developments highlighted the need for a new assessment of validation criteria. The Task Force recommends that a small set of validation data be presented in an easily understood format, relative to both the full PDB and the applicable resolution class, with greater detail available to interested users. Most importantly, we recommend that referees and editors judging the quality of structural experiments have access to a concise summary of well-established quality indicators.

11.

Discovery of urine biomarkers for bladder cancer via global metabolomics

Hangchuan Shi Xiang Li Qingyang Zhang Hongmei Yang 《Biomarkers》2016,21(7):578-588

Bladder cancer (BC) is latent in its early stage and lethal in its late stage. Therefore, early diagnosis and intervention are essential for successful BC treatment. Considering the limitations of current diagnostic tools, noninvasive biomarkers that are both highly sensitive and specific are needed to improve the overall survival and quality of life of patients. With the advent of systems biology, “-omics” technologies have been developed over the past few decades. As a promising member, global metabolomics has increasingly been found to have clear potential for biomarker discovery. However, urinary metabolomics studies related to BC have lagged behind those of other urinary cancers, and major findings have not been systematically reported. The objective of this review is to comprehensively list the currently identified potential urinary metabolite biomarkers for BC.

12.

Assessing the performance of qpAdm: a statistical tool for studying population admixture

Éadaoin Harney Nick Patterson David Reich John Wakeley 《Genetics》2021,217(4)

qpAdm is a statistical tool for studying the ancestry of populations with histories that involve admixture between two or more source populations. Using qpAdm, it is possible to identify plausible models of admixture that fit the population history of a group of interest and to calculate the relative proportion of ancestry that can be ascribed to each source population in the model. Although qpAdm is widely used in studies of population history of human (and nonhuman) groups, relatively little has been done to assess its performance. We performed a simulation study to assess the behavior of qpAdm under various scenarios in order to identify areas of potential weakness and establish recommended best practices for use. We find that qpAdm is a robust tool that yields accurate results in many cases, including when data coverage is low, there are high rates of missing data or ancient DNA damage, or when diploid calls cannot be made. However, we caution against co-analyzing ancient and present-day data, the inclusion of an extremely large number of reference populations in a single model, and analyzing population histories involving extended periods of gene flow. We provide a user guide suggesting best practices for the use of qpAdm.

13.

A study of performance validation methods for high-pressure steam sterilizers

李国晏徐斌崔萱林孙一枚周丽娟《微生物学免疫学进展》2002,30(2):44-47

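High-pressure steam sterilizers must undergo performance validation before first use and again after a period of operation. Heat distribution, heat penetration, and microbial challenge tests were used to validate the performance of GE and GEV pulsed steam sterilizers. In the sterilization cycles for porous loads and for liquid loads, no cold spots were found in the sterilization chamber, heat penetration was effective, and Bacillus stearothermophilus (ATCC 7953) was completely killed within the specified sterilization time. The sterilizers were therefore confirmed to meet production requirements in every respect, and the resulting validation protocol, test methods, and results were recognized by the national GMP certification center.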

14.

Checking the geographical origin of oak wood: molecular and statistical tools

Deguilloux MF Pemonge MH Bertel L Kremer A Petit RJ 《Molecular ecology》2003,12(6):1629-1636

New methods for better identification of timber geographical origin would constitute an important technical element in the forest industry, for phytosanitary certification procedures or in the chain of custody developed for the certification of timber from sustainably managed forests. In the case of the European white oaks, a detailed reference map of chloroplast (cp) DNA variation across the range exists, and we propose here to use the strong geographical structure, characterized by a differentiation of western vs. eastern populations, for the purpose of oak wood traceability. We first developed cpDNA markers permitting the characterization of haplotype on degraded DNA obtained from wood samples. The techniques were subsequently validated by confirming the full correspondence between genotypes obtained from living tissues (buds) and from wood collected from the same individual oak. Finally, a statistical procedure was used to test if the haplotype composition of a lot of wood samples is consistent with its presumed geographical origin. Clearly, the technique cannot permit the unambiguous identification of wood products of unknown origin but can be used to check the conformity of genetic composition of wood samples with the region of alleged origin. This could lead to major applications not only in the forest industry but also in archaeology or in palaeobotany.

15.

Integrating metabolomics and transcriptomics data to discover a biocatalyst that can generate the amine precursors for alkamide biosynthesis

Ludmila Rizhsky Huanan Jin Michael R. Shepard Harry W. Scott Alicen M. Teitgen M. Ann Perera Vandana Mhaske Adarsh Jose Xiaobin Zheng Matt Crispin Eve S. Wurtele Dallas Jones Manhoi Hur Elsa Góngora‐Castillo C. Robin Buell Robert E. Minto Basil J. Nikolau 《The Plant journal : for cell and molecular biology》2016,88(5):775-793

16.

Proteomics and metabolomics for analysis of the dynamics of microbiota

Alex Van Belkum David Broadwell Douglas Lovern Lauren Petersen George Weinstock Dr. W. Michael Dunne Jr. 《Expert review of proteomics》2018,15(2):101-104

17.

Application of metabolomics in lactic acid bacteria research

杨慧步雨珊易华西《天然产物研究与开发》2019,(8):1474-1479,1349

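Metabolomics is an important branch of systems biology and, being efficient and high-throughput, is widely used in food science, pharmacology, and related fields. This paper outlines the separation and detection techniques of metabolomics, reviews its applications in lactic acid bacteria identification, fermentation regulation, and gut microbiota research, and discusses potential problems and future trends of metabolomics in lactic acid bacteria research, with the aim of providing a reference for applying metabolomics to microorganisms in the food industry.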

18.

Metabolomics Standards Workshop and the development of international standards for reporting metabolomics experimental results (total citations: 5; self-citations: 0; citations by others: 5)

Castle AL Fiehn O Kaddurah-Daouk R Lindon JC 《Briefings in bioinformatics》2006,7(2):159-165

Informatics standards and controlled vocabularies are essential for allowing information technology to help exchange, manage, interpret and compare large data collections. In a rapidly evolving field, the challenge is to work out how best to describe, but not prescribe, the use of these technologies and methods. A Metabolomics Standards Workshop was held by the US National Institutes of Health (NIH) to bring together multiple ongoing standards efforts in metabolomics with the NIH research community. The goals were to discuss metabolomics workflows (methods, technologies and data treatments) and the needs, challenges and potential approaches to developing a Metabolomics Standards Initiative that will help facilitate this rapidly growing field which has been a focus of the NIH roadmap effort. This report highlights specific aspects of what was presented and discussed at the 1st and 2nd August 2005 Metabolomics Standards Workshop.

19.

Advances in metabolomics research on plant responses to abiotic stress (total citations: 4; self-citations: 0; citations by others: 4)

滕中秋 付卉青 贾少华 孟薇薇 戴荣继 邓玉林 《植物生态学报》2011,35(1):110-118

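Metabolomics is an ideal platform for studying plant metabolism: qualitative and quantitative analysis of plant metabolites under stress, using modern detection and analysis techniques, makes it possible to monitor how they change over time. Integrating omics platforms, including genomics, transcriptomics, and metabolomics, provides an even more powerful toolbox. Linking the information obtained from the different omics levels helps in studying, as a whole, how biological systems respond to genetic or environmental changes, for example by determining at which level a change in metabolites originates, and so helps unravel the complex mechanisms of plant stress responses. This paper reviews recent studies that use metabolomics, alone and in combination with proteomics and genomics, to explore plant responses to abiotic stress. The application of metabolomics has broadened our understanding of the molecular mechanisms by which plants tolerate abiotic stress; further research of this kind, combined with the integration of plant metabolomics, transcriptomics, proteomics, and genomics, will help clarify plant stress-response mechanisms at the whole-system level.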

20.

Feature extraction from gene microarray data based on wavelet low-frequency coefficients

刘玉杰 刘毅慧 《生物信息学》2011,9(3):255-258,262

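Feature extraction and classification are key problems in pattern recognition. Combining wavelet analysis with support vector machine theory, we construct a classifier model that divides prostate cancer gene microarray data into two classes, cancer and normal. Wavelet low-frequency coefficients are extracted to represent the original data and fed into a support vector machine classifier. Experiments show that using the low-frequency coefficients from a 4-level db1 wavelet decomposition yields a correct classification rate of 93.53%; the Haar wavelet gives 92.94%. Extracting low-frequency coefficients from different wavelets therefore makes little difference to classification performance.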
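The feature-extraction step can be sketched as follows: a multi-level Haar decomposition in which only the low-frequency (approximation) coefficients are kept at each level. This is a minimal sketch, not the authors' code. It assumes the input length is divisible by 2 raised to the number of levels (real wavelet libraries handle other lengths by padding), the function name and toy profile are hypothetical, and the support vector machine step is not shown.

```python
from math import sqrt


def haar_low_frequency(signal, levels):
    """Return the Haar approximation coefficients after `levels` decompositions.

    Each level replaces adjacent pairs (a, b) with (a + b) / sqrt(2) and
    discards the detail coefficients, halving the length of the vector.
    """
    coeffs = [float(v) for v in signal]
    for _ in range(levels):
        if len(coeffs) % 2:
            raise ValueError("signal length must be divisible by 2**levels")
        coeffs = [(coeffs[i] + coeffs[i + 1]) / sqrt(2)
                  for i in range(0, len(coeffs), 2)]
    return coeffs


# Hypothetical expression profile with 32 measurements; four levels of
# decomposition compress it into 32 / 2**4 = 2 low-frequency features.
profile = list(range(32))
features = haar_low_frequency(profile, levels=4)
```

Each 4-level coefficient equals the sum of 16 consecutive input values divided by 4, so the feature vector is a coarse, smoothed summary of the original profile that could then be passed to a classifier.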