Similar Literature
10 similar documents were retrieved.
1.
Many biomedical studies have identified important imaging biomarkers that are associated with both repeated clinical measures and a survival outcome. The functional joint model (FJM) framework, proposed by Li and Luo in 2017, investigates the association between repeated clinical measures and survival data, while adjusting for both high-dimensional images and low-dimensional covariates via functional principal component analysis (FPCA). In this paper, we propose a novel algorithm for estimating the FJM based on functional partial least squares (FPLS). Our numerical studies demonstrate that, compared to the FPCA-based approach, the proposed FPLS algorithm can yield more accurate and robust estimation and prediction in many important scenarios. We apply the proposed FPLS algorithm to a neuroimaging study. Data used in the preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database.
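To make the dimension-reduction step concrete, here is a minimal, hypothetical sketch of computing functional principal component scores from discretized imaging data. It is not the authors' implementation; the simulated data, component count, and variable names are all illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative stand-in for high-dimensional imaging data:
# one row per subject, one column per voxel/grid point.
rng = np.random.default_rng(0)
n_subjects, n_grid = 200, 1000
X = rng.normal(size=(n_subjects, n_grid))

X_centered = X - X.mean(axis=0)              # remove the estimated mean function
fpca = PCA(n_components=5).fit(X_centered)   # eigenfunctions on the discretized grid
scores = fpca.transform(X_centered)          # FPC scores: low-dimensional summaries

# In a joint model, these scores (n_subjects x 5) would replace the raw image
# as covariates in the longitudinal and survival submodels.
print(scores.shape, fpca.explained_variance_ratio_.round(3))
```

An FPLS-style variant would instead extract components that maximize covariance with the outcome rather than the variance of the images alone, which is the motivation for the algorithm proposed in the abstract.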

2.
Ecologists are increasingly asking large-scale and/or broad-scope questions that require vast datasets. In response, various top-down efforts and incentives have been implemented to encourage data sharing and integration. However, despite general consensus on the critical need for more open ecological data, several roadblocks still discourage compliance and participation in these projects; as a result, ecological data remain largely unavailable. Grassroots initiatives (i.e. efforts initiated and led by cohesive groups of scientists focused on specific goals) have thus far been overlooked as a powerful means to meet these challenges. These bottom-up collaborative data integration projects can play a crucial role in making high-quality datasets available because they tackle the heterogeneity of ecological data at a scale where it is still manageable, all the while offering the support and structure to do so. These initiatives foster best practices in data management and provide tangible rewards to researchers who choose to invest time in sound data stewardship. By maintaining proximity between data generators and data users, grassroots initiatives improve data interpretation and ensure high-quality data integration while providing fair acknowledgement to data generators. We encourage researchers to formalize existing collaborations and to engage in local activities that improve the availability and distribution of ecological data. By fostering communication and interaction among scientists, we are convinced that grassroots initiatives can significantly support the development of global-scale data repositories. In doing so, these projects help address important ecological questions and support policy decisions.

3.
Data independent acquisition (DIA) proteomics techniques have matured enormously in recent years, thanks to multiple technical developments in, for example, instrumentation and data analysis approaches. However, there is still much room for improvement in how DIA data meet the FAIR (Findability, Accessibility, Interoperability and Reusability) data principles. This includes more tailored data-sharing practices and open data standards, since public databases and data standards for proteomics were mostly designed with data-dependent acquisition (DDA) data in mind. Here we first describe the current state of the art in the context of FAIR data for proteomics in general, and for DIA approaches in particular. To improve the current situation for DIA data, we make the following recommendations for the future: (i) develop an open data standard for spectral libraries; (ii) require that the spectral libraries used in DIA experiments be deposited in ProteomeXchange resources; (iii) improve the support for DIA data in the data standards developed by the Proteomics Standards Initiative; and (iv) improve the support for DIA datasets in ProteomeXchange resources, including more tailored metadata requirements.

4.
This paper explores data compatibility issues arising from the assessment of remnant native vegetation condition using satellite remote sensing and field-based data. Space-borne passive remote sensing is increasingly used to provide a total sample and synoptic overview of the spectral and spatial characteristics of native vegetation canopies at a regional scale. However, field-collected data are often not designed for integration with remotely sensed data, and combining such unsuited datasets can contribute to data uncertainty and result in inconclusive findings. These problems, and their potential solutions, form the basis of this paper: how can field surveys be designed to support and improve compatibility with remotely sensed total surveys? Key criteria were identified for consideration when designing field-based surveys of native vegetation condition (and other similar applications) intended to incorporate remotely sensed data. The criteria include recommendations on the siting of plots, the need for reference location plots, and the number, size, and distribution of sample plots within a study area. The difficulties associated with successfully integrating these data are illustrated using real examples taken from a study of the vegetation in the Little River Catchment, New South Wales, Australia.

5.
Rosner B, Glynn RJ, Lee ML. Biometrics. 2006;62(1):185-192.
The Wilcoxon signed rank test is a frequently used nonparametric test for paired data (e.g., pre- and post-treatment measurements) based on independent units of analysis. This test cannot be used for paired comparisons arising from clustered data (e.g., when paired comparisons are available for each of the two eyes of an individual). To incorporate clustering, a generalization of the randomization test formulation of the signed rank test is proposed, where the unit of randomization is at the cluster level (e.g., person), while the individual paired units of analysis are at the subunit-within-cluster level (e.g., eye within person). An adjusted variance estimate of the signed rank test statistic is then derived, which can be used for either balanced (same number of subunits per cluster) or unbalanced (different number of subunits per cluster) data, with an exchangeable correlation structure, with or without tied values. The resulting test statistic is shown to be asymptotically normal as the number of clusters becomes large, provided the cluster size is bounded. Simulation studies based on correlated ranked data simulated from a signed log-normal distribution indicate appropriate type I error for data sets with 20 or more clusters and a superior power profile compared with either the ordinary signed rank test based on the average cluster difference score or the multivariate signed rank test of Puri and Sen. Finally, the methods are illustrated with two data sets: (i) an ophthalmologic data set comparing electroretinogram (ERG) measurements in retinitis pigmentosa (RP) patients before and after an experimental surgical procedure, and (ii) a nutritional data set from a randomized prospective study of nutritional supplements in RP patients, in which vitamin E intake outside of study capsules is compared before and after randomization to monitor compliance with nutritional protocols.
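The following is a minimal Monte Carlo sketch of the cluster-level randomization idea described above: signs are flipped for whole clusters rather than individual subunits, which preserves the within-cluster correlation under the null. This is an illustrative approximation only (the paper derives an analytic adjusted variance instead), and the function and variable names are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata

# Sketch of a cluster-level randomization version of the signed rank test:
# whole clusters (e.g., persons), not subunits (e.g., eyes), are the unit of
# sign-flipping.
def clustered_signed_rank_pvalue(diffs, clusters, n_perm=10000, seed=0):
    diffs = np.asarray(diffs, dtype=float)
    clusters = np.asarray(clusters)
    ranks = rankdata(np.abs(diffs))                    # ranks of |differences|, ties averaged
    observed = np.sum(np.sign(diffs) * ranks)          # signed rank statistic
    rng = np.random.default_rng(seed)
    labels = np.unique(clusters)
    count = 0
    for _ in range(n_perm):
        flips = rng.choice([-1, 1], size=labels.size)  # one random sign per cluster
        signs = flips[np.searchsorted(labels, clusters)]
        stat = np.sum(signs * np.sign(diffs) * ranks)
        if abs(stat) >= abs(observed):
            count += 1
    return (count + 1) / (n_perm + 1)                  # two-sided Monte Carlo p-value

# Example: three clusters with two subunits each (e.g., two eyes per person).
p = clustered_signed_rank_pvalue([1.2, 0.8, -0.3, 0.5, 2.1, 1.7],
                                 clusters=[1, 1, 2, 2, 3, 3])
print(p)
```

With only a handful of clusters the 2^k possible sign-flip patterns could be enumerated exactly; the Monte Carlo loop is used here purely for brevity.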

6.
7.
8.
There is an increasing need for life cycle data for bio-based products, which becomes particularly evident with the recent drive for greenhouse gas reporting and carbon footprinting studies. Meeting this need is challenging given that many bio-products have not yet been studied by life cycle assessment (LCA), and those that have are specific and limited to certain geographic regions. In an attempt to bridge data gaps for bio-based products, LCA practitioners can use either proxy data sets (e.g., using existing environmental data for apples to represent pears) or extrapolated data (e.g., deriving new data for pears by modifying data for apples according to pear-specific production characteristics). This article explores the challenges and consequences of using these two approaches. Several case studies are used to illustrate the trade-offs between uncertainty and ease of application, with carbon footprinting as an example. As shown, the use of proxy data sets is the quickest and easiest solution for bridging data gaps but also carries the highest uncertainty. In contrast, data extrapolation methods may require extensive expert knowledge and are thus harder to use, but they give more robust results in bridging data gaps. They can also provide a sound basis for understanding variability in bio-based product data. If resources (time, budget, and expertise) are limited, the use of averaged proxy data may be an acceptable compromise for initial or screening assessments. Overall, the article highlights the need for further research on the development and validation of different approaches to bridging data gaps for bio-based products.
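As a purely illustrative contrast between the two approaches, the sketch below uses made-up numbers (the apple footprint and crop yields are assumptions, not values from the article) to show how a proxy value is reused unchanged while an extrapolated value is adjusted with product-specific characteristics.

```python
# Hypothetical numbers only: proxy reuse vs. simple yield-based extrapolation.
apple_footprint = 0.30          # kg CO2e per kg of apples (illustrative value)
apple_yield = 35.0              # t/ha, illustrative
pear_yield = 25.0               # t/ha, illustrative

# Proxy approach: reuse the apple value unchanged for pears.
pear_proxy = apple_footprint

# Extrapolation approach: adjust the per-kg burden by the ratio of yields,
# assuming area-based inputs (fuel, fertilizer) dominate the footprint.
pear_extrapolated = apple_footprint * (apple_yield / pear_yield)

print(f"proxy: {pear_proxy:.2f}, extrapolated: {pear_extrapolated:.2f} kg CO2e/kg")
```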

9.
The improved accessibility of data that can be used in human health risk assessment (HHRA) necessitates advanced methods to incorporate them optimally into HHRA analyses. This article investigates the application of data fusion methods to handling multiple sources of data in HHRA and its components. This application can be performed at two levels: first, as an integrative framework that incorporates various pieces of information with knowledge bases to build improved knowledge about an entity and its behavior; and second, more specifically, to combine multiple values for the state of a certain feature or variable (e.g., toxicity) into a single estimate. This work first reviews data fusion formalisms in terms of the architectures and techniques that correspond to each of these two levels. Then, by handling several data fusion problems related to HHRA components, it illustrates the benefits and challenges of their application.
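As a hypothetical illustration of the second, value-level kind of fusion mentioned above, the sketch below combines several estimates of the same quantity by inverse-variance weighting. This is one common fusion rule chosen purely for illustration, not the specific formalism reviewed in the article, and all numbers and names are assumptions.

```python
import numpy as np

# Combine several reported estimates of the same quantity (e.g., a toxicity
# value) into one, weighting each source by its precision (inverse variance).
def inverse_variance_fusion(estimates, variances):
    estimates = np.asarray(estimates, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    fused = np.sum(weights * estimates) / np.sum(weights)
    fused_var = 1.0 / np.sum(weights)
    return fused, fused_var

# Three hypothetical studies reporting the same endpoint with different uncertainty.
value, var = inverse_variance_fusion([2.1, 2.6, 1.9], [0.10, 0.40, 0.25])
print(f"fused estimate: {value:.2f} (variance {var:.3f})")
```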

10.
Data integration is key to functional and comparative genomics because integration allows diverse data types to be evaluated in new contexts. To achieve data integration in a scalable and sensible way, semantic standards are needed, both for naming things (standardized nomenclatures, use of keywords) and for knowledge representation. The Mouse Genome Informatics database and other model organism databases help to close the gap between information and understanding of biological processes because these resources enforce well-defined nomenclature and knowledge representation standards. Model organism databases have a critical role to play in ensuring that diverse kinds of data, especially genome-scale data sets and information, remain useful to the biological community in the long term. The efforts of model organism database groups ensure not only that organism-specific data are integrated, curated, and accessible, but also that the information is structured in such a way that comparison of biological knowledge across model organisms is facilitated.
