Similar literature (20 records found)
1.

Background

Randomised controlled trials (RCTs) are perceived as the gold-standard method for evaluating healthcare interventions, and increasingly include quality of life (QoL) measures. The observed results are susceptible to bias if a substantial proportion of outcome data are missing. The review aimed to determine whether imputation was used to deal with missing QoL outcomes.

Methods

A random selection of 285 RCTs published during 2005/6 in the British Medical Journal, Lancet, New England Journal of Medicine and Journal of the American Medical Association was identified.

Results

QoL outcomes were reported in 61 (21%) trials. Six (10%) reported having no missing data, 20 (33%) reported ≤ 10% missing, eleven (18%) 11%–20% missing, and eleven (18%) reported >20% missing. Missingness was unclear in 13 (21%). Missing data were imputed in 19 (31%) of the 61 trials. Imputation was part of the primary analysis in 13 trials, but a sensitivity analysis in six. Last value carried forward was used in 12 trials and multiple imputation in two. Following imputation, the most common analysis method was analysis of covariance (10 trials).

Conclusion

The majority of studies did not impute missing data and carried out a complete-case analysis. For those studies that did impute missing data, researchers tended to prefer simpler methods of imputation, despite more sophisticated methods being available.
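
The "last value carried forward" approach mentioned in the Results can be illustrated with a minimal sketch. The function name `locf` and the QoL scores below are hypothetical and are not taken from the reviewed trials; the sketch only shows the mechanics of filling each missing visit with the most recent observed value.

```python
from typing import List, Optional


def locf(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing entry (None) with the most recent observed value.

    Entries before the first observation stay None, because there is
    nothing to carry forward yet.
    """
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical QoL scores for one participant across six visits.
qol_scores = [62.0, 58.0, None, None, 55.0, None]
print(locf(qol_scores))
```

The simplicity is the appeal of the method, but it assumes that a participant's score stays constant after dropout, which is why more sophisticated approaches such as multiple imputation are often preferred.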

2.

Introduction

A common problem in metabolomics data analysis is the existence of a substantial number of missing values, which can complicate, bias, or even prevent certain downstream analyses. One of the most widely-used solutions to this problem is imputation of missing values using a k-nearest neighbors (kNN) algorithm to estimate missing metabolite abundances. kNN implicitly assumes that missing values are uniformly distributed at random in the dataset, but this is typically not true in metabolomics, where many values are missing because they are below the limit of detection of the analytical instrumentation.

Objectives

Here, we explore the impact of nonuniformly distributed missing values (missing not at random, or MNAR) on imputation performance. We present a new model for generating synthetic missing data and a new algorithm, No-Skip kNN (NS-kNN), that accounts for MNAR values to provide more accurate imputations.

Methods

We compare the imputation errors of the original kNN algorithm using two distance metrics, NS-kNN, and a recently developed algorithm KNN-TN, when applied to multiple experimental datasets with different types and levels of missing data.

Results

Our results show that NS-kNN typically outperforms kNN when at least 20–30% of missing values in a dataset are MNAR. NS-kNN also has lower imputation errors than KNN-TN on realistic datasets when at least 50% of missing values are MNAR.

Conclusion

Accounting for the nonuniform distribution of missing values in metabolomics data can significantly improve the results of imputation algorithms. The NS-kNN method imputes missing metabolomics data more accurately than existing kNN-based approaches when used on realistic datasets.
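
For orientation, the baseline kNN imputation that this abstract builds on can be sketched as follows. This is plain kNN with a Euclidean-type distance over co-observed columns, not NS-kNN or KNN-TN; the function name `knn_impute`, the choice of `k`, and the toy matrix are assumptions made for illustration only.

```python
import numpy as np


def knn_impute(X, k=2):
    """Impute NaN entries with the mean of the k nearest rows.

    Distances between rows use only the columns observed in both rows
    (root mean squared difference). For each missing entry, only
    neighbours that have that column observed are eligible.
    """
    X = np.asarray(X, dtype=float)
    out = X.copy()
    observed = ~np.isnan(X)
    n_rows = X.shape[0]
    for i in range(n_rows):
        missing_cols = np.where(~observed[i])[0]
        if missing_cols.size == 0:
            continue
        dists = np.full(n_rows, np.inf)
        for j in range(n_rows):
            if j == i:
                continue
            shared = observed[i] & observed[j]
            if shared.any():
                diff = X[i, shared] - X[j, shared]
                dists[j] = np.sqrt(np.mean(diff ** 2))
        order = np.argsort(dists)
        for c in missing_cols:
            candidates = [j for j in order if np.isfinite(dists[j]) and observed[j, c]]
            if candidates:
                out[i, c] = X[candidates[:k], c].mean()
    return out


# Toy abundance matrix; one value in the second row is missing.
toy = np.array([
    [1.0, 2.0, 3.0],
    [1.1, 2.1, np.nan],
    [5.0, 6.0, 7.0],
    [1.2, 1.9, 3.2],
])
imputed = knn_impute(toy, k=2)
```

Because the neighbours are chosen without regard to why a value is missing, this baseline will tend to overestimate values that are absent because they fell below the limit of detection, which is the situation the abstract addresses.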

3.

Introduction

The generic metabolomics data processing workflow is constructed with a serial set of processes including peak picking, quality assurance, normalisation, missing value imputation, transformation and scaling. The combination of these processes should present the experimental data in an appropriate structure so as to identify the biological changes in a valid and robust manner.

Objectives

Currently, different researchers apply different data processing methods and no assessment of the permutations applied to UHPLC-MS datasets has been published. Here we wish to define the most appropriate data processing workflow.

Methods

We assess the influence of normalisation, missing value imputation, transformation and scaling methods on univariate and multivariate analysis of UHPLC-MS datasets acquired for different mammalian samples.

Results

Our studies have shown that once data are filtered, missing values are not correlated with m/z, retention time or response. Following an exhaustive evaluation, we recommend PQN normalisation with no missing value imputation and no transformation or scaling for univariate analysis. For PCA we recommend applying PQN normalisation with Random Forest missing value imputation, glog transformation and no scaling method. For PLS-DA we recommend PQN normalisation, KNN as the missing value imputation method, generalised logarithm transformation and no scaling. These recommendations are based on searching for the biologically important metabolite features independent of their measured abundance.

Conclusion

The appropriate choice of normalisation, missing value imputation, transformation and scaling methods differs depending on the data analysis method and the choice of method is essential to maximise the biological derivations from UHPLC-MS datasets.
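
The PQN (probabilistic quotient normalisation) step recommended above can be sketched as follows. This is a minimal version under stated assumptions: rows are samples, columns are features, the reference spectrum is the feature-wise median across all samples, and the initial total-area normalisation used in some PQN variants is omitted. The function name `pqn_normalise` and the toy data are hypothetical.

```python
import numpy as np


def pqn_normalise(X):
    """Probabilistic quotient normalisation (simplified sketch).

    1. Build a reference spectrum (feature-wise median over samples).
    2. Divide every sample by the reference to get feature quotients.
    3. Take the median quotient per sample as its dilution factor.
    4. Divide each sample by its dilution factor.

    NaN entries are ignored when taking medians; reference values are
    assumed to be non-zero.
    """
    X = np.asarray(X, dtype=float)
    reference = np.nanmedian(X, axis=0)
    quotients = X / reference
    dilution = np.nanmedian(quotients, axis=1)
    return X / dilution[:, None]


# Three hypothetical samples that differ only by a dilution factor.
base = np.array([10.0, 20.0, 30.0, 40.0])
samples = np.vstack([base, 2.0 * base, 0.5 * base])
normalised = pqn_normalise(samples)
```

In this toy case the three samples collapse onto the same profile after normalisation, which is the behaviour PQN is designed to give when samples differ only in overall dilution.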

4.

Introduction

Data processing is one of the biggest problems in metabolomics, given the high number of samples analyzed and the need for multiple software packages for each step of the processing workflow.

Objectives

Merge in the same platform the steps required for metabolomics data processing.

Methods

KniMet is a workflow for the processing of mass spectrometry-metabolomics data based on the KNIME Analytics platform.

Results

The approach includes key steps to follow in metabolomics data processing: feature filtering, missing value imputation, normalization, batch correction and annotation.

Conclusion

KniMet provides the user with a local, modular and customizable workflow for the processing of both GC–MS and LC–MS open profiling data.

5.

Background

Untargeted mass spectrometry (MS)-based metabolomics data often contain missing values that reduce statistical power and can introduce bias in biomedical studies. However, a systematic assessment of the various sources of missing values and strategies to handle these data has received little attention. Missing data can occur systematically, e.g. from run day-dependent effects due to limits of detection (LOD); or it can be random as, for instance, a consequence of sample preparation.

Methods

We investigated patterns of missing data in an MS-based metabolomics experiment of serum samples from the German KORA F4 cohort (n = 1750). We then evaluated 31 imputation methods in a simulation framework and biologically validated the results by applying all imputation approaches to real metabolomics data. We examined the ability of each method to reconstruct biochemical pathways from data-driven correlation networks, and the ability of the method to increase statistical power while preserving the strength of established metabolic quantitative trait loci.

Results

Run day-dependent LOD-based missing data accounts for most missing values in the metabolomics dataset. Although multiple imputation by chained equations performed well in many scenarios, it is computationally and statistically challenging. K-nearest neighbors (KNN) imputation on observations with variable pre-selection showed robust performance across all evaluation schemes and is computationally more tractable.

Conclusion

Missing data in untargeted MS-based metabolomics data occur for various reasons. Based on our results, we recommend that KNN-based imputation is performed on observations with variable pre-selection since it showed robust results in all evaluation schemes.

6.

Background

Identification of common genes associated with comorbid diseases can be critical in understanding their pathobiological mechanism. This work presents a novel method to predict missing common genes associated with a disease pair. Searching for missing common genes is formulated as an optimization problem to minimize network based module separation from two subgraphs produced by mapping genes associated with disease onto the interactome.

Results

Using cross-validation on more than 600 disease pairs, our method achieves a significantly higher average receiver operating characteristic (ROC) score of 0.95, compared to a baseline ROC score of 0.60 obtained using randomized data.

Conclusion

Missing common gene prediction aims to complete the gene set associated with comorbid diseases for a better understanding of biological intervention. It will also be useful for gene-targeted therapeutics related to comorbid diseases. This method can be further considered for prediction of missing edges to complete the subgraph associated with a disease pair.
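
As a reminder of what the reported ROC score measures, the area under the ROC curve can be computed directly as the probability that a randomly chosen positive is scored above a randomly chosen negative. The sketch below assumes that interpretation; the function name `roc_auc` and the scores are hypothetical and unrelated to the study's data.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive receives the higher score; ties count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical prediction scores for true common genes and for decoys.
auc = roc_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the 0.95 versus 0.60 comparison in the Results.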

7.

Background

Single-cell RNA sequencing (scRNA-seq) technology provides an effective way to study cell heterogeneity. However, due to the low capture efficiency and stochastic gene expression, scRNA-seq data often contain a high percentage of missing values. It has been shown that the missing rate can reach approximately 30% even after noise reduction. To accurately recover missing values in scRNA-seq data, we need to know where the missing data are, how much data is missing, and what the values of these data are.

Methods

To solve these three problems, we propose a novel model with a hybrid machine learning method, namely, missing imputation for single-cell RNA-seq (MISC). To solve the first problem, we transformed it to a binary classification problem on the RNA-seq expression matrix. Then, for the second problem, we searched for the intersection of the classification results, zero-inflated model and false negative model results. Finally, we used the regression model to recover the data in the missing elements.

Results

We compared the raw data without imputation, the mean-smooth neighbor cell trajectory, and MISC on chronic myeloid leukemia (CML) data, the primary somatosensory cortex and the hippocampal CA1 region of mouse brain cells. On the CML data, MISC discovered a trajectory branch from the CP-CML to the BC-CML, which provides direct evidence of evolution from CP to BC stem cells. On the mouse brain data, MISC clearly divides the pyramidal CA1 into different branches, which is direct evidence of subpopulations within pyramidal CA1. Meanwhile, with MISC, the oligodendrocyte cells became an independent group with an apparent boundary.

Conclusions

Our results showed that the MISC model improved the cell type classification and could be instrumental to study cellular heterogeneity. Overall, MISC is a robust missing data imputation model for single-cell RNA-seq data.

8.

Background

Non-Negative Matrix factorization has become an essential tool for feature extraction in a wide spectrum of applications. In the present work, our objective is to extend the applicability of the method to the case of missing and/or corrupted data due to outliers.

Results

An essential property for missing data imputation and detection of outliers is that the uncorrupted data matrix is low rank, i.e. has only a small number of degrees of freedom. We devise a new version of the Bregman proximal idea which preserves nonnegativity and mix it with the Augmented Lagrangian approach for simultaneous reconstruction of the features of interest and detection of the outliers using a sparsity-promoting ℓ1 penalty.

Conclusions

An application to the analysis of gene expression data of patients with bladder cancer is finally proposed.
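
To make the low-rank idea concrete, the sketch below imputes missing entries with a masked non-negative factorisation fitted by standard multiplicative updates. This is a deliberately simpler, swapped-in technique: it is not the Bregman proximal and Augmented Lagrangian method of the abstract and it does no outlier detection. The function name `masked_nmf`, the rank, the iteration count and the synthetic data are all assumptions.

```python
import numpy as np


def masked_nmf(X, mask, rank, n_iter=200, seed=0, eps=1e-9):
    """Fit X ~ W @ H on observed entries only (mask == True).

    Uses multiplicative updates for the masked squared error, which keep
    W and H non-negative. Returns the factors and the loss per iteration.
    """
    rng = np.random.default_rng(seed)
    X = np.where(mask, X, 0.0)  # missing entries contribute nothing
    n, m = X.shape
    W = rng.random((n, rank)) + 0.1
    H = rng.random((rank, m)) + 0.1
    losses = []
    for _ in range(n_iter):
        W *= (X @ H.T) / ((mask * (W @ H)) @ H.T + eps)
        H *= (W.T @ X) / (W.T @ (mask * (W @ H)) + eps)
        losses.append(float(np.sum(mask * (X - W @ H) ** 2)))
    return W, H, losses


# Synthetic rank-2 non-negative matrix with three entries hidden.
rng = np.random.default_rng(42)
X_true = rng.random((6, 2)) @ rng.random((2, 5))
mask = np.ones(X_true.shape, dtype=bool)
mask[0, 1] = mask[3, 4] = mask[5, 0] = False
X_obs = np.where(mask, X_true, np.nan)

W, H, losses = masked_nmf(X_obs, mask, rank=2)
X_filled = np.where(mask, X_true, W @ H)  # keep observed values, fill the rest
```

The missing entries are recovered only to the extent that the data really are close to low rank and enough entries are observed in each row and column.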

9.

Background

To preserve patient anonymity, health register data may be provided as binned data only. Here we consider, as an example, how to estimate mean survival time after a diagnosis of metastatic colorectal cancer from Norwegian register data on time to death or censoring binned into 30-day intervals. All events occurring in the first three months (90 days) after diagnosis were removed to achieve comparability with a clinical trial. The aim of the paper is to develop and implement a simple yet flexible method for analyzing such interval-censored and truncated data.

Methods

Considering interval censoring a missing data problem, we implement a simple multiple imputation strategy that allows flexible sensitivity analyses with respect to the shape of the censoring distribution. To allow identification of appropriate parametric models, a χ² goodness-of-fit test, also imputation-based, is derived and supplemented with diagnostic plots. Uncertainty estimates for mean survival times are obtained via a simulation strategy. The validity and statistical efficiency of the proposed method for varying interval lengths are investigated in a simulation study and compared with simpler alternatives.

Results

Mean survival times estimated from the register data ranged from 1.2 (SE = 0.09) to 3.2 (SE = 0.31) years depending on period of diagnosis and choice of parametric model. The shape of the censoring distribution within intervals generally did not influence results, whereas the choice of parametric model did, even when different models fit the data equally well. In simulation studies both simple midpoint imputation and multiple imputation yielded nearly unbiased analyses (relative biases of -0.6% to 9.4%) and confidence intervals with near-nominal coverage probabilities (93.4% to 95.7%) for censoring intervals shorter than six months. For 12-month censoring intervals, multiple imputation provided better protection against bias, and coverage probabilities closer to nominal values, than simple midpoint imputation.

Conclusion

Binning of event and censoring times should be considered a viable strategy for anonymizing register data on survival times, as they may be readily analyzed with methods based on multiple imputation.
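
The two imputation strategies compared in the Results can be sketched for binned event times as follows. This is a simplified illustration, assuming every record is an observed event (no right censoring), a uniform distribution of times within each interval, and pooling by a plain average of the per-dataset means; the parametric survival models of the paper are not reproduced. All function names and the bin indices are hypothetical.

```python
import random
from statistics import mean


def midpoint_impute(bins, width=30.0):
    """Replace bin index b, covering [b*width, (b+1)*width), by its midpoint."""
    return [(b + 0.5) * width for b in bins]


def multiple_impute(bins, width=30.0, n_imputations=20, seed=1):
    """Draw n_imputations completed datasets, each with one uniform draw
    per record inside that record's interval."""
    rng = random.Random(seed)
    return [[(b + rng.random()) * width for b in bins]
            for _ in range(n_imputations)]


def pooled_mean(datasets):
    """Average the mean survival time over the imputed datasets."""
    return mean(mean(d) for d in datasets)


# Hypothetical 30-day bin indices for five patients.
bins = [3, 4, 4, 10, 25]
midpoint_times = midpoint_impute(bins)
imputed_datasets = multiple_impute(bins)
mi_mean = pooled_mean(imputed_datasets)
```

Swapping the uniform draw for another within-interval distribution is the kind of sensitivity analysis the Methods describe; the between-dataset spread of the means is what lets multiple imputation reflect the extra uncertainty due to binning.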

10.

Background

Non-negative matrix factorization (NMF) has been introduced as an important method for mining biological data. Although packages implemented in R and other programming languages currently exist, they either provide only a few optimization algorithms or focus on a specific application field. No complete NMF package exists for the bioinformatics community to perform various data mining tasks on biological data.

Results

We provide a convenient MATLAB toolbox containing both the implementations of various NMF techniques and a variety of NMF-based data mining approaches for analyzing biological data. Data mining approaches implemented within the toolbox include data clustering and bi-clustering, feature extraction and selection, sample classification, missing values imputation, data visualization, and statistical comparison.

Conclusions

A series of analyses such as molecular pattern discovery, biological process identification, dimension reduction, disease prediction, visualization, and statistical comparison can be performed using this toolbox.

11.

Introduction

Allograft rejection is still an important complication after kidney transplantation. Currently, monitoring of these patients mostly relies on the measurement of serum creatinine and clinical evaluation. The gold standard for diagnosing allograft rejection, i.e. performing a renal biopsy, is invasive and expensive. So far no adequate biomarkers are available for routine use.

Objectives

We aimed to develop a urine metabolite constellation that is characteristic of acute renal allograft rejection.

Methods

NMR-Spectroscopy was applied to a training cohort of transplant recipients with and without acute rejection.

Results

We obtained a metabolite constellation of four metabolites that shows promising performance to detect renal allograft rejection in the cohorts used (AUC of 0.72 and 0.74, respectively).

Conclusion

A metabolite constellation was defined with the potential for further development of an in-vitro diagnostic test that can support physicians in their clinical assessment of a kidney transplant patient.

12.

Background

Cerebral infarction with different causes seems to differ in fibrinogen levels, so the current work explores the relationship between the fibrinogen level and the subtypes of the TOAST criteria in the acute stage of ischemic stroke.

Methods

A total of 577 patients treated for acute ischemic stroke in our hospital from December 2008 to December 2010 were included, and blood samples taken within 72 hours of onset were processed with the fibrinogen (PT-der) measurement. The selected patients were classified according to the TOAST criteria to study the distribution of fibrinogen levels in the stroke subtypes.

Results

The distribution of fibrinogen levels did not differ significantly among the subtypes.

Conclusions

In the acute stage of ischemic stroke, fibrinogen level was not related to the subtypes of the TOAST criteria.

13.

Introduction

Collecting feces is easy. It offers direct outcome to endogenous and microbial metabolites.

Objectives

Given the lack of consensus on fecal sample preparation, especially in animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.

Methods

The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.

Results

A rapid and simple protocol involving feces extraction with methanol (1/3, M/V) followed by centrifugation and a filtration step (10 kDa) was developed.

Conclusion

The workflow generated repeatable and informative fingerprints for robust metabolome characterization.

14.

Background

Patients presenting with bilateral trigeminal hypoesthesia may go on to have trigeminal isolated sensory neuropathy, a benign, purely trigeminal neuropathy, or facial-onset sensory motor neuronopathy (FOSMN), a malignant life-threatening condition. No diagnostic criteria can yet differentiate the two conditions at their onset. Nor is it clear whether the two diseases are distinct entities or share common pathophysiological mechanisms.

Methods

Seeking pathophysiological and diagnostic information to distinguish these two conditions at their onset, in this neurophysiological and morphometric study we neurophysiologically assessed function in myelinated and unmyelinated fibres and histologically examined supraorbital nerve biopsy specimens with optic and electron microscopy in 13 consecutive patients with recent onset trigeminal hypoesthesia and pain.

Results

The disease course differed distinctly among the 13 patients. During a mean 10-year follow-up, the disease remained relatively stable in eight patients, whereas in the other five it progressed to possibly life-threatening motor disturbances and extra-trigeminal spread. From two to six years elapsed between the first sensory symptoms and the onset of motor disorders. In patients with trigeminal isolated sensory neuropathy (TISN) and in those with FOSMN, neurophysiological and histological examination documented a neuronopathy manifesting with trigeminal nerve damage selectively affecting myelinated fibres, but sparing the Ia-fibre-mediated proprioceptive reflex.

Conclusions

Although no clinical diagnostic criteria can distinguish the two conditions at onset, neurophysiological and nerve-biopsy findings specify that in both disorders trigeminal nerve damage manifests as a dissociated neuronopathy affecting myelinated and sparing unmyelinated fibres, thus suggesting similar pathophysiological mechanisms.

15.

Background

Despite its importance in affecting adult pain and disability, there is a lack of universal criteria for the diagnosis and evaluation of thoraco-lumbar Junctional Kyphosis (JK), and a gold-standard measurement and diagnostic system does not exist. This study aims to verify the sensitivity and specificity of clinical and Formetric surface topography (FST) data in identifying Junctional Kyphosis with respect to the radiographic reference standard.

Methods

Design: This is a cross-sectional study from a prospective database started in March 2003.
Participants: 38 subjects.
Inclusion criteria: Patients selected by age according to Risser score 1, at first visit with lateral x-rays and FST.
Diagnostic tests used to detect JK:
  • FST criteria: level of thoraco-lumbar inflexion point in percentage compared to the total height of the spine.
  • X-ray criteria: lower limit of thoracic kyphosis below T12.
Statistics: sensitivity, specificity, positive (PPV) and negative predictive values (NPV), ROC curve.

Results

FST showed good reliability in detecting JK: with a threshold of 75%, PPV was 100%, NPV was 86% and the Area Under the Curve was 83%.
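
The statistics listed in the Methods all follow from a 2×2 table of test result against the radiographic reference. The sketch below shows the arithmetic; the function name `diagnostic_metrics` and the counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 confusion table.

    tp / fn: reference-positive cases the test did / did not detect.
    tn / fp: reference-negative cases the test cleared / flagged.
    No guard against empty rows or columns (division by zero).
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts for a 38-subject sample.
metrics = diagnostic_metrics(tp=8, fp=2, fn=4, tn=24)
```

Sensitivity and specificity condition on the reference result, while PPV and NPV condition on the test result and therefore also depend on how common the condition is in the sample.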

Conclusion

Useful criteria able to characterize JK, allowing diagnosis and monitoring of the deformity, are still lacking, and further studies are needed to explore this issue.

16.

Introduction

Data sharing is being increasingly required by journals and has been heralded as a solution to the ‘replication crisis’.

Objectives

(i) Review data sharing policies of journals publishing the most metabolomics papers associated with open data and (ii) compare these journals’ policies to those that publish the most metabolomics papers.

Methods

A PubMed search was used to identify metabolomics papers. Metabolomics data repositories were manually searched for linked publications.

Results

Journals that support data sharing are not necessarily those with the most papers associated with open metabolomics data.

Conclusion

Further efforts are required to improve data sharing in metabolomics.

17.

Introduction

Untargeted metabolomics is a powerful tool for biological discoveries. To analyze the complex raw data, significant advances in computational approaches have been made, yet it is not clear how exhaustive and reliable the data analysis results are.

Objectives

Assessment of the quality of raw data processing in untargeted metabolomics.

Methods

Five published untargeted metabolomics studies were reanalyzed.

Results

Omissions of at least 50 relevant compounds from the original results as well as examples of representative mistakes were reported for each study.

Conclusion

Incomplete raw data processing shows unexplored potential of current and legacy data.

18.

Purpose

Mirizzi syndrome is a rare complication of long-standing cholelithiasis. The purpose of this study is to retrospectively evaluate the diagnostic and treatment methods applied in patients with Mirizzi syndrome.

Materials and methods

Our experience with 27 cases with Mirizzi syndrome is presented. They were diagnosed either by imaging techniques, or during surgical operation. All of the patients were managed surgically.

Results

Eight patients were diagnosed preoperatively and the rest intraoperatively. The morbidity rate after surgery was 18.5%, and the mortality rate was zero. The patients were free of symptoms at follow-up three months after surgery.

Conclusion

Mirizzi syndrome is rarely diagnosed preoperatively, and US proved inadequate for this purpose. Surgery is the only therapy and usually also provides the definitive diagnosis.

19.

Background

Recently, some studies demonstrated that HMGB1, a proinflammatory mediator belonging to the alarmin family, has a key role in different acute and chronic immune disorders. Asthma is a complex disease characterised by recurrent and reversible airflow obstruction associated with airway hyper-responsiveness and airway inflammation.

Objective

This literature review aims to analyse advances in understanding the role, use and potential diagnostic application of HMGB1 in asthma.

Methods

We reviewed experimental studies that investigated the pathogenetic role of HMGB1 in bronchial airway hyper-responsiveness and inflammation, and the correlation between HMGB1 level and asthma.

Results

A total of 19 studies assessing the association between HMGB1 and asthma were identified.

Conclusions

This literature review confirmed the involvement of HMGB1 in diseases characterised by chronic inflammation, especially in pulmonary pathologies. The reported findings suggest a potential role of the alarmin as a staging tool and a marker of therapeutic efficacy; finally, inhibiting HMGB1 in humans in order to counteract inflammation should be the aim of future studies.

20.

Introduction

It is difficult to elucidate the metabolic and regulatory factors causing lipidome perturbations.

Objectives

This work simplifies this process.

Methods

A method has been developed to query an online holistic lipid metabolic network (of 7923 metabolites) to extract the pathways that connect the input list of lipids.

Results

The output enables pathway visualisation and the querying of other databases to identify potential regulators. When used to study a plasma lipidome dataset of polycystic ovary syndrome, 14 enzymes were identified, of which 3 are linked to ELAVL1, an mRNA stabiliser.

Conclusion

This method provides a simplified approach to identifying potential regulators causing lipid-profile perturbations.
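
The idea of extracting the pathways that connect an input list of lipids can be illustrated with a breadth-first search over a small graph. This is only a sketch of the general technique: the abstract does not state which path-finding algorithm is used, and the five-node network, the function names `shortest_path` and `connecting_pathways`, and the input list are hypothetical stand-ins for the 7923-metabolite network.

```python
from collections import deque
from itertools import combinations


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest node path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


def connecting_pathways(graph, lipids):
    """Shortest path for every pair of input lipids that is connected."""
    paths = {}
    for a, b in combinations(lipids, 2):
        path = shortest_path(graph, a, b)
        if path is not None:
            paths[(a, b)] = path
    return paths


# Toy undirected network of lipid classes (illustrative only).
toy_network = {
    "PC": ["DG"],
    "DG": ["PC", "TG", "PA"],
    "TG": ["DG"],
    "PA": ["DG", "LPA"],
    "LPA": ["PA"],
}
pathways = connecting_pathways(toy_network, ["TG", "LPA", "PC"])
```

In a real network each edge would carry the enzyme catalysing the conversion, so the enzymes along the extracted paths become the candidates to look up in regulator databases.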
