Similar Articles
20 similar articles retrieved (search time: 47 ms)
1.

Introduction

Direct injection Fourier-transform mass spectrometry (FT-MS) allows for the high-throughput and high-resolution detection of thousands of metabolite-associated isotopologues. However, spectral artifacts can generate large numbers of spectral features (peaks) that do not correspond to known compounds. Misassignment of these artifactual features creates interpretive errors and limits our ability to discern the role of representative features within living systems.

Objectives

Our goal is to develop rigorous methods that identify and handle spectral artifacts within the context of high-throughput FT-MS-based metabolomics studies.

Results

We observed three types of artifacts unique to FT-MS that we named high peak density (HPD) sites: fuzzy sites, ringing and partial ringing. While ringing artifacts are well-known, fuzzy sites and partial ringing have not been previously well-characterized in the literature. We developed new computational methods based on comparisons of peak density within a spectrum to identify regions of spectra with fuzzy sites. We used these methods to identify and eliminate fuzzy site artifacts in an example dataset of paired cancer and non-cancer lung tissue samples and evaluated the impact of these artifacts on classification accuracy and robustness.
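The within-spectrum peak-density comparison can be illustrated with a minimal sketch; the window width, the density threshold, and the function name below are illustrative assumptions, not the authors' published parameters.

```python
import numpy as np

def flag_high_peak_density(mz, window=0.5, factor=10.0):
    """Flag m/z regions whose local peak density greatly exceeds the
    spectrum-wide median density: candidate fuzzy-site artifacts.
    `window` (in Da) and `factor` are illustrative, not published values."""
    mz = np.sort(np.asarray(mz, dtype=float))
    starts = np.arange(mz[0], mz[-1] + window, window)
    # count peaks falling in each half-open window [s, s + window)
    counts = np.array([np.sum((mz >= s) & (mz < s + window)) for s in starts])
    baseline = np.median(counts[counts > 0])
    return starts[counts > factor * baseline], counts
```

Flagged windows can then be excluded from feature matrices before classifier training, as in the lung-tissue analysis described above.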

Conclusion

Our methods robustly identified consistent fuzzy site artifacts in our FT-MS metabolomics spectral data. Without artifact identification and removal, 91.4% classification accuracy was achieved on an example lung cancer dataset; however, these classifiers rely heavily on artifactual features present in fuzzy sites. Proper removal of fuzzy site artifacts produces a more robust classifier based on non-artifactual features, with slightly improved accuracy of 92.4% in our example analysis.

2.
Lyu  Chuqiao  Wang  Lei  Zhang  Juhua 《BMC genomics》2018,19(10):905-165

Background

The DNase I hypersensitive sites (DHSs) are associated with cis-regulatory DNA elements. An efficient method of identifying DHSs can enhance our understanding of chromatin accessibility. Despite a multitude of resources available online, including experimental datasets and computational tools, the complex language of DHSs remains incompletely understood.

Methods

Here, we address this challenge using a state-of-the-art machine learning approach. We present a novel convolutional neural network (CNN) that combines Inception-like modules with a gating mechanism, capturing both multiple sequence patterns and long-term associations in DNA sequences, to predict multi-scale DHSs in Arabidopsis, rice and Homo sapiens.

Results

Our method obtains an area under the curve (AUC) of 0.961 on Arabidopsis, 0.969 on rice and 0.918 on Homo sapiens.
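The AUC values above can be reproduced from predicted scores and labels with a rank-based calculation; the sketch below uses the Mann-Whitney formulation and is not the authors' evaluation code.

```python
import numpy as np

def auc_score(y_true, y_score):
    """ROC AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive is scored above a randomly chosen negative."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()  # positive outranks negative
    ties = (pos[:, None] == neg[None, :]).sum()    # ties count half
    return (greater + 0.5 * ties) / (len(pos) * len(neg))
```

This pairwise form is O(n²) but exact; for large test sets a rank-sum implementation is equivalent and faster.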

Conclusions

Our method provides an efficient and accurate deep learning approach for identifying multi-scale DHS sequences.

3.

Background

Ocular images play an essential role in ophthalmological diagnosis. Imbalanced datasets are an inevitable issue in automated ocular disease diagnosis; the scarcity of positive samples tends to cause misdiagnosis of severe cases during classification. Exploring an effective computer-aided diagnostic method for imbalanced ophthalmological datasets is therefore crucial.

Methods

In this paper, we develop an effective cost-sensitive deep residual convolutional neural network (CS-ResCNN) classifier to diagnose ophthalmic diseases from retro-illumination images. First, the regions of interest (the crystalline lens) are automatically identified via twice-applied Canny detection and a Hough transformation, and the localized zones are fed into the CS-ResCNN to extract high-level features for subsequent automatic diagnosis. Second, the impact of the cost factors on the CS-ResCNN is analyzed using a grid-search procedure to verify that the proposed system is robust and efficient.
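The abstract does not specify how the cost factor enters the network's objective; one common realisation is a class-weighted cross-entropy, sketched below with an illustrative `cost_factor` of the kind the grid search would tune.

```python
import numpy as np

def cost_sensitive_loss(probs, labels, cost_factor=5.0):
    """Class-weighted binary cross-entropy: errors on the rare positive
    class incur `cost_factor` times the loss of errors on negatives.
    The factor value is an illustrative, tunable hyperparameter."""
    probs = np.clip(np.asarray(probs, dtype=float), 1e-12, 1 - 1e-12)
    labels = np.asarray(labels, dtype=float)
    weights = np.where(labels == 1, cost_factor, 1.0)
    ce = -(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))
    return float(np.mean(weights * ce))
```

Raising the cost factor pushes the classifier toward higher sensitivity at some expense of specificity, which matches the sensitivity gains reported below.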

Results

Qualitative analyses and quantitative experimental results demonstrate that our proposed method outperforms other conventional approaches, with excellent mean accuracy (92.24%), specificity (93.19%), sensitivity (89.66%) and AUC (97.11%). Moreover, the sensitivity of the CS-ResCNN is more than 13.6% higher than that of the native CNN method.

Conclusion

Our study provides a practical strategy for addressing imbalanced ophthalmological datasets and has the potential to be applied to other medical images. The developed and deployed CS-ResCNN could serve as computer-aided diagnosis software for ophthalmologists in clinical application.

4.

Background

Human Down syndrome (DS) is usually caused by genomic micro-duplications and dosage imbalances of human chromosome 21 and is associated with many genomic and phenotypic abnormalities. Although DS occurs in about 1 per 1,000 births worldwide, a very high rate, researchers have not yet found any effective method to cure it. Currently, the most efficient means of DS prevention are screening and early detection.

Methods

In this study, we applied deep learning techniques to a set of Illumina genotyping array data and built a bi-stream convolutional neural network (CNN) model to screen for and predict the occurrence of DS. First, we constructed the image input data by converting the intensity of each SNP site into chromosome SNP maps. Next, we proposed a bi-stream CNN architecture with nine layers and two branch models. We merged the two branches into one model at the fourth convolutional layer and output the prediction in the last layer.
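The conversion of per-SNP intensities into image-like chromosome SNP maps might look as follows; the map width and zero-padding scheme are assumptions for illustration, not the authors' published layout.

```python
import numpy as np

def snp_map(intensities, width=64):
    """Lay out a 1-D vector of per-SNP intensities as a 2-D image so that
    genomically adjacent SNPs stay adjacent in the map; the final row is
    zero-padded. `width` is an illustrative choice, not a published value."""
    x = np.asarray(intensities, dtype=float)
    height = -(-len(x) // width)  # ceiling division
    padded = np.zeros(height * width)
    padded[:len(x)] = x
    return padded.reshape(height, width)
```

One such map per chromosome would then feed each branch of a two-branch CNN.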

Results

Our bi-stream CNN model achieved 99.3% average accuracy and very low false-positive and false-negative rates, which is necessary for further applications in disease prediction and medical practice. We visualized the feature maps and learned filters from intermediate convolutional layers, which revealed genomic patterns and correlated SNP variations in human DS genomes. We also compared our method with other CNN and traditional machine learning models, and further analyzed and discussed the characteristics and strengths of our bi-stream CNN model.

Conclusions

Our bi-stream model used two branch CNN models to learn local genome features and regional patterns among adjacent genes and SNP sites from two chromosomes simultaneously. It achieved the best performance on all evaluation metrics when compared with two single-stream CNN models and three traditional machine-learning algorithms. The visualized feature maps also provide opportunities to study the genomic markers and pathway components associated with human DS, offering insights for the development of gene therapy and genomic medicine.

5.

Background

The protein encoded by the gene ybgI was chosen as a target for a structural genomics project emphasizing the relation of protein structure to function.

Results

The structure of the ybgI protein is a toroid composed of six polypeptide chains forming a trimer of dimers. Each polypeptide chain binds two metal ions on the inside of the toroid.

Conclusion

The toroidal structure is comparable to that of some proteins that are involved in DNA metabolism. The di-nuclear metal site could imply that the specific function of this protein is as a hydrolase-oxidase enzyme.

6.
Gao S  Xu S  Fang Y  Fang J 《Proteome science》2012,10(Z1):S7

Background

Identification of phosphorylation sites by computational methods is becoming increasingly important because it reduces labor-intensive and costly experiments and can improve our understanding of the common properties and underlying mechanisms of protein phosphorylation.

Methods

A multitask learning framework that learns four kinase families simultaneously, instead of studying each kinase family of phosphorylation sites separately, is presented in this study. The framework includes two multitask classification methods: Multi-Task Least Squares Support Vector Machines (MTLS-SVMs) and Multi-Task Feature Selection (MT-Feat3).

Results

Using the multitask learning framework, we successfully identified 18 common features shared by the four kinase families of phosphorylation sites. The reliability of the selected features is demonstrated by their consistent performance across the two multitask learning methods.

Conclusions

The selected features can be used to build efficient multitask classifiers with good performance, suggesting they are important to protein phosphorylation across all four kinase families.

7.
8.

Objectives

To design and fabricate a 3D-printed cervical cage from a composite of polylactic acid (PLA) and nano-sized β-tricalcium phosphate (β-TCP).

Results

CAD analysis provided a useful platform for designing the preliminary cage. In vitro cell culture and in vivo animal experiments gave promising results for the biocompatibility of the constructs. Endplate-matching evaluation showed a better matching degree for the 3D-printed cages than for conventional cages, and biomechanical evaluation likewise showed better mechanical properties.

Conclusion

The novel 3D-printed PLA/β-TCP cage showed good application potential, indicating a novel, feasible, and inexpensive method of manufacturing cervical fusion cages.

9.

Introduction

Mass spectrometry imaging (MSI) experiments result in complex multi-dimensional datasets, which require specialist data analysis tools.

Objectives

We have developed massPix—an R package for analysing and interpreting data from MSI of lipids in tissue.

Methods

massPix produces single ion images, performs multivariate statistics and provides putative lipid annotations based on accurate mass matching against generated lipid libraries.
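Accurate-mass annotation of the kind massPix performs can be sketched conceptually as follows; massPix itself is an R package, so this Python sketch, its two-entry library, and the 5 ppm tolerance are illustrative assumptions only.

```python
def annotate(mz_observed, library, ppm_tol=5.0):
    """Return putative annotations: library entries whose exact mass lies
    within `ppm_tol` parts-per-million of the observed m/z. The library is
    a {name: exact_mass} dict; real libraries hold thousands of entries."""
    return [name for name, exact in library.items()
            if abs(mz_observed - exact) / exact * 1e6 <= ppm_tol]
```

Because matching is by accurate mass alone, annotations remain putative: isobaric lipids within the tolerance cannot be distinguished without further evidence.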

Results

Classification of tissue regions with high spectral similarity can be carried out by principal components analysis (PCA) or k-means clustering.

Conclusion

massPix is an open-source tool for the analysis and statistical interpretation of MSI data, and is particularly useful for lipidomics applications.

10.

Background

Adverse drug reactions (ADRs) are unintended and harmful reactions caused by normal uses of drugs. Predicting and preventing ADRs in the early stage of the drug development pipeline can help to enhance drug safety and reduce financial costs.

Methods

In this paper, we developed machine learning models, including a deep learning framework, that can simultaneously predict ADRs and identify the molecular substructures associated with those ADRs without defining the substructures a priori.

Results

We evaluated the performance of our model with ten different state-of-the-art fingerprint models and found that neural fingerprints from the deep learning model outperformed all other methods in predicting ADRs. Via feature analysis on drug structures, we identified important molecular substructures that are associated with specific ADRs and assessed their associations via statistical analysis.

Conclusions

The deep learning model with feature analysis, substructure identification, and statistical assessment provides a promising solution for identifying risky components within molecular structures and can potentially help to improve drug safety evaluation.

11.

Introduction

Feces are easy to collect and give direct access to endogenous and microbial metabolites.

Objectives

Given the lack of consensus about fecal sample preparation, especially in animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.

Methods

The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.

Results

A rapid and simple protocol involving feces extraction with methanol (1/3, M/V), followed by centrifugation and a filtration step (10 kDa), was developed.

Conclusion

The workflow generated repeatable and informative fingerprints for robust metabolome characterization.

12.

Introduction

Data sharing is being increasingly required by journals and has been heralded as a solution to the ‘replication crisis’.

Objectives

(i) To review the data-sharing policies of the journals publishing the most metabolomics papers associated with open data, and (ii) to compare these policies with those of the journals publishing the most metabolomics papers overall.

Methods

A PubMed search was used to identify metabolomics papers. Metabolomics data repositories were manually searched for linked publications.

Results

The journals that support data sharing are not necessarily those with the most papers associated with open metabolomics data.

Conclusion

Further efforts are required to improve data sharing in metabolomics.

13.
14.

Introduction

Improving feed utilization in cattle is required to reduce input costs, increase production, and ultimately improve sustainability of the beef cattle industry. Characterizing metabolic differences between efficient and non-efficient animals will allow stakeholders to identify more efficient cattle during backgrounding.

Objectives

This study used an untargeted metabolomics approach to determine differences in serum metabolites between animals of low and high residual feed intake.

Methods

Residual feed intake was determined for 50 purebred Angus steers, and 29 steers were selected for the study based on low versus high feed efficiency. Blood samples were collected from the steers and analyzed by untargeted metabolomics via mass spectrometry. Metabolite data were analyzed in MetaboAnalyst and visualized using orthogonal partial least squares discriminant analysis, with p-values derived from permutation testing. Non-esterified fatty acids, urea nitrogen, and glucose were measured using commercially available colorimetric assay kits, and differences in these measures between residual feed intake groups were assessed using one-way analysis of variance in SAS 9.4.
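Permutation-derived p-values of the kind used above can be sketched in a few lines; the statistic (difference of group means), permutation count, and seed below are illustrative choices, not the study's exact procedure.

```python
import numpy as np

def permutation_pvalue(group_a, group_b, n_perm=2000, seed=0):
    """Two-group permutation test on the absolute difference of means:
    shuffle the pooled values and count how often a permuted difference
    is at least as extreme as the observed one."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([group_a, group_b]).astype(float)
    n_a = len(group_a)
    observed = abs(np.mean(pooled[:n_a]) - np.mean(pooled[n_a:]))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(np.mean(pooled[:n_a]) - np.mean(pooled[n_a:])) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0
```

The same scheme generalises to multivariate statistics such as the OPLS-DA model fit, by permuting class labels instead of raw values.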

Results

Four metabolites were found to be associated with differences in feed efficiency. No differences were found in other serum metabolites, including serum urea nitrogen, non-esterified fatty acids, and glucose.

Conclusions

The four metabolites that differed between low and high residual feed intake animals have important functions related to nutrient utilization, among other roles, in cattle. This information will aid the identification of more efficient steers during backgrounding.

15.

Objective

To examine the activities of residual enzymes in dried shiitake mushrooms, a traditional foodstuff in Japanese cuisine, for possible applications in food processing.

Results

Polysaccharide-degrading enzymes remained intact in dried shiitake mushrooms, with high activities of amylase, β-glucosidase and pectinase. Digestion of potato was tested using dried shiitake powder: the enzymes reacted with potato tuber specimens to solubilize sugars even under a heterogeneous solid-state condition, and their reaction modes differed at 38 and 50 °C.

Conclusion

Dried shiitake mushrooms have a potential use in food processing as an enzyme preparation.

16.

Introduction

The metabolome of a biological system is affected by multiple factors, including the factor of interest (e.g. metabolic perturbation due to disease) and unwanted factors that are not the primary focus of the study (e.g. batch effects, gender, and level of physical activity). Removing these unwanted variations is advantageous, as they may complicate biological interpretation of the data.

Objectives

We aim to develop a new unwanted-variation elimination (UVE) method, called clustering-based unwanted residuals elimination (CURE), to reduce metabolic variation caused by unwanted or hidden factors in metabolomic data.

Methods

A mean-centered metabolomic dataset can be viewed as the combination of a studied-factor matrix and a residual matrix. The CURE method assumes that the residual should be normally distributed if it contains only inter-individual variation. If, however, the residual forms multiple clusters in the feature subspace of principal components analysis or partial least squares discriminant analysis, it may contain variation due to unwanted factors. This unwanted variation is removed by K-means clustering of the residuals and subtraction of each cluster's mean. The process is iterated until the residual no longer forms multiple clusters in the feature subspace.
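One CURE-style pass over the residual matrix can be sketched as follows; the plain k-means helper, the fixed cluster count, and the single (non-iterated) pass are simplifications of the iterative method described above, not the authors' implementation.

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-point seeding
    (assumes each cluster stays non-empty; enough for this sketch)."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers

def cure_pass(residuals, k=2):
    """One CURE-style pass: cluster the residual matrix and subtract each
    cluster's mean, removing between-cluster (unwanted) variation while
    keeping within-cluster (inter-individual) variation."""
    labels, centers = kmeans(residuals, k)
    return residuals - centers[labels]
```

The full method would re-test the corrected residual for clustering and repeat until no cluster structure remains.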

Results

Three simulated datasets and a human metabolomic dataset were used to demonstrate the performance of the proposed CURE method. CURE removed most of the variation caused by unwanted factors while preserving inter-individual variation between samples.

Conclusion

The CURE method can effectively remove unwanted data variation, and can serve as an alternative UVE method for metabolomic data.

17.

Background

Currently available microscope slide scanners produce whole-slide images of histological sections at various resolutions. Nevertheless, the acquisition area, and thus the visualization of large tissue samples, is limited by the standardized size of the glass slides used daily in pathology departments. The proposed solution was developed to build composite virtual slides from images of large tumor fragments.

Materials and methods

Images of HES- or immunostained histological sections of carefully labeled fragments from a representative slice of breast carcinoma were acquired with a digital slide scanner at 20× magnification. The tiling program involves three steps: straightening the tissue fragment images using a polynomial interpolation method, then building and assembling strips of contiguous tissue-sample whole-slide images in the x and y directions. The final image is saved in the pyramidal BigTIFF file format. The program was tested on several tumor slices, and a correlation-based quality control was performed on five artificially cut images.
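The strip-assembly step can be illustrated with a minimal sketch that joins tiles into strips along x and stacks the strips along y; the straightening step, overlap handling, and pyramidal file output are omitted, and equally sized tiles are assumed.

```python
import numpy as np

def assemble(tile_grid):
    """Assemble a 2-D grid of equally sized image tiles into one composite:
    tiles are concatenated along x within each row to form strips, and the
    strips are then stacked along y."""
    strips = [np.concatenate(row, axis=1) for row in tile_grid]
    return np.concatenate(strips, axis=0)
```

In practice each strip can be written incrementally so the composite never has to fit in memory at full resolution.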

Results

Sixty tumor slices from twenty surgical specimens, cut into two to twenty-six pieces, were reconstructed. For quality control, a median correlation coefficient of 98.71% was obtained between native and reconstructed images.

Conclusions

The proposed method is efficient and adapts well to the daily working conditions of classical pathology laboratories.

18.

Background

Our purpose was to study the association between intracranial atherosclerosis, as measured by cavernous carotid artery calcification (ICAC) on head CT, and atrophic changes of the supratentorial brain demonstrated by MRI.

Methods

Institutional review board approval was obtained for this retrospective study of 65 consecutive patients presenting acutely who had both head CT and MRI. Calcifications of the intracranial cavernous carotid arteries (ICAC) were graded from 1 to 4 on bone-window CT images, and the four grades were then combined into high-calcium (grades 3 and 4) and low-calcium (grades 1 and 2) subgroups. Brain MRI was independently evaluated for cortical and central atrophy. Demographics and cardiovascular risk factors were compared between subjects with high and low ICAC. The relationship between CT-demonstrated ICAC and brain atrophy patterns was evaluated both without and with adjustment for cerebral ischemic scores and cardiovascular risk factors.

Results

Forty-six of the 65 patients (71%) had high ICAC on head CT. Subjects with high ICAC were older and had a higher prevalence of hypertension, diabetes, coronary artery disease (CAD), atrial fibrillation and previous stroke (CVA) than those with low ICAC. Age correlated strongly with both supratentorial atrophy patterns. There was no correlation between ICAC and cortical atrophy; there was, however, a correlation between ICAC and central atrophy, which persisted even after adjustment for age.

Conclusion

Age is the most important determinant of atrophic cerebral changes. However, high ICAC showed an age-independent association with central atrophy.

19.

Introduction

Aqueous–methanol mixtures have successfully been applied to extract a broad range of metabolites from plant tissue. However, a certain amount of material remains insoluble.

Objectives

To enlarge the metabolic compendium, two ionic liquids were selected to extract the methanol-insoluble fraction of trunk material from Betula pendula.

Methods

The extracted compounds were analyzed by LC/MS and GC/MS.

Results

The results show that 1-butyl-3-methylimidazolium acetate (IL-Ac) predominantly yielded fatty acids, whereas 1-ethyl-3-methylimidazolium tosylate (IL-Tos) mostly yielded phenolic structures. Interestingly, bark yielded more ionic-liquid-soluble metabolites than interior wood.

Conclusion

From this, one can conclude that the application of ionic liquids may expand the metabolic snapshot.

20.

Introduction

It is difficult to elucidate the metabolic and regulatory factors causing lipidome perturbations.

Objectives

This work simplifies this process.

Methods

A method has been developed to query an online holistic lipid metabolic network (of 7923 metabolites) to extract the pathways that connect the input list of lipids.
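Extracting the pathways that connect an input list of lipids amounts to a graph search over the metabolic network; a breadth-first sketch over a toy adjacency list is shown below (the published method queries a network of 7923 metabolites, and the lipid names here are illustrative).

```python
from collections import deque

def connecting_path(graph, source, target):
    """Breadth-first search for a shortest path linking two metabolites in
    an adjacency-list network; returns None if they are not connected."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

Running this for every pair of input lipids and taking the union of the returned paths yields the connecting sub-network, whose internal nodes are candidate regulators to look up in other databases.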

Results

The output enables pathway visualisation and the querying of other databases to identify potential regulators. When used to study a plasma lipidome dataset of polycystic ovary syndrome, 14 enzymes were identified, of which 3 are linked to ELAVL1, an mRNA stabiliser.

Conclusion

This method provides a simplified approach to identifying potential regulators causing lipid-profile perturbations.
