Similar literature
20 similar articles found
1.
MOTIVATION: Application of mass spectrometry in proteomics is a breakthrough in high-throughput analyses. Early applications have focused on protein expression profiles to differentiate among various types of tissue samples (e.g. normal versus tumor). Here our goal is to use mass spectra to differentiate bacterial species using whole-organism samples. The raw spectra are similar to spectra of tissue samples, raising some of the same statistical issues (e.g. non-uniform baselines and higher noise associated with higher baseline), but are substantially noisier. As a result, new preprocessing procedures are required before these spectra can be used for statistical classification. RESULTS: In this study, we introduce novel preprocessing steps that can be used with any mass spectra. These comprise a standardization step and a denoising step. The noise level for each spectrum is determined using only data from that spectrum. Only spectral features that exceed a threshold defined by the noise level are subsequently used for classification. Using this approach, we trained the Random Forest program to classify 240 mass spectra into four bacterial types. The method resulted in zero prediction errors in the training samples and in two test datasets having 240 and 300 spectra, respectively.
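The per-spectrum thresholding described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not say how the noise level is estimated, so the median absolute deviation (MAD) estimator, the multiplier `k`, and the function names are all hypothetical choices.

```python
from statistics import median

def noise_level(intensities):
    """Estimate baseline and noise from this spectrum alone (assumed: median and scaled MAD)."""
    center = median(intensities)
    mad = median(abs(x - center) for x in intensities)
    # 1.4826 scales the MAD to a standard deviation for Gaussian noise.
    return center, 1.4826 * mad

def features_above_noise(intensities, k=3.0):
    """Return indices of points exceeding center + k * noise (k is a hypothetical setting)."""
    center, sigma = noise_level(intensities)
    threshold = center + k * sigma
    return [i for i, x in enumerate(intensities) if x > threshold]

# Toy spectrum: flat noisy baseline with two clear peaks.
spectrum = [1.0, 1.2, 0.9, 1.1, 9.5, 1.0, 0.8, 1.3, 7.2, 1.1]
peaks = features_above_noise(spectrum)
```

Because the threshold is derived from each spectrum separately, a noisier spectrum automatically gets a higher cutoff; only the retained indices would then be passed to a classifier.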

2.
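Early diagnosis of cancer can significantly improve patient survival, and this is especially true for patients with hepatocellular carcinoma. Machine learning is an effective tool for cancer classification. A central difficulty in cancer classification is selecting a low-dimensional feature subset with high classification accuracy from complex, high-dimensional cancer datasets. This paper proposes a two-stage feature selection method, SC-BPSO. A new filter method, the SC filter, is designed by combining the Spearman correlation coefficient and the chi-square test of independence as the filter's evaluation function; the SC filter is then combined with a wrapper method based on binary particle swarm optimization (BPSO) to achieve two-stage feature selection. The method is applied to a high-dimensional cancer classification problem, distinguishing normal samples from hepatocellular carcinoma samples. First, 130 liver tissue microRNA sequencing datasets (64 hepatocellular carcinoma, 66 normal liver tissue) from the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) were preprocessed, and the MiRME algorithm was used to extract three types of features from the raw sequence files: microRNA expression level, editing level, and post-editing expression level. Next, the parameters of the SC-BPSO algorithm were tuned for the hepatocellular carcinoma classification scenario and a key feature subset was selected. Finally, classification models were built and predictions made; the results were compared with those for feature subsets selected by an information gain filter, an information gain ratio filter, and a BPSO wrapper feature selection algorithm, using four classifiers (random forest, support vector machine, decision tree, and KNN) with identical parameters. The feature subset selected by SC-BPSO achieved a classification accuracy of up to 98.4%. The results show that, compared with the other three feature selection algorithms, SC-BPSO can effectively find feature subsets that are smaller and more accurate. This may be of considerable significance for cancer classification problems involving high-dimensional data with few samples.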
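As a rough illustration of the filter stage, the sketch below scores each feature by its Spearman rank correlation with the class label and keeps those above a cutoff. It covers only the Spearman half of the SC filter; how the abstract's method combines it with the chi-square test, and the subsequent BPSO wrapper, are not specified here and are not reproduced. The feature names, data, and the `min_abs_rho` cutoff are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def filter_features(columns, labels, min_abs_rho=0.5):
    """Keep features whose |rho| with the label meets the (hypothetical) cutoff."""
    return [name for name, col in columns.items()
            if abs(spearman(col, labels)) >= min_abs_rho]

# Toy data: 0 = normal, 1 = tumor; feature names are made up.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "mir_a": [1.0, 2.0, 1.5, 5.0, 6.0, 5.5],  # tracks the label
    "mir_b": [3.0, 1.0, 2.0, 2.0, 3.0, 1.0],  # unrelated to the label
}
selected = filter_features(columns, labels)
```

A filter like this is cheap to run on thousands of features, which is why it is a natural first stage before a more expensive wrapper search.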

3.
The complex structural organization of the aortic valve (AV) extracellular matrix (ECM) enables large and highly nonlinear tissue level deformations. The collagen and elastin (elastic) fibers within the ECM form an interconnected fibrous network (FN) and are known to be the main load-bearing elements of the AV matrix. The role of the FN in enabling deformation has been investigated and documented. However, there is little data on the correlation between tissue level and FN-level strains. Investigating this correlation will help establish the mode of strain transfer (affine or nonaffine) through the AV tissue as a key feature in microstructural modeling and will also help characterize the local FN deformation across the AV sample in response to applied tissue level strains. In this study, the correlation between applied strains at tissue level, macrostrains across the tissue surface, and local FN strains was investigated. Results showed that the FN strain distribution across AV samples was inhomogeneous and nonuniform, as well as anisotropic. There was no direct transfer of the deformation applied at tissue level to the fibrous network. Loading modes induced in the FN are different from those applied at the tissue level as a result of different local strains in the valve layers. This nonuniformity of local strains induced internal shearing within the FN of the AV, possibly exposing the aortic valve interstitial cells (AVICs) to shear strains and stresses.

4.
5.
6.
7.
Acar E, Plopper GE, Yener B. PLoS ONE, 2012, 7(3): e32227
The structure/function relationship is fundamental to our understanding of biological systems at all levels, and drives most, if not all, techniques for detecting, diagnosing, and treating disease. However, at the tissue level of biological complexity we encounter a gap in the structure/function relationship: having accumulated an extraordinary amount of detailed information about biological tissues at the cellular and subcellular level, we cannot assemble it in a way that explains the correspondingly complex biological functions these structures perform. To help close this information gap we define here several quantitative temperospatial features that link tissue structure to its corresponding biological function. Both histological images of human tissue samples and fluorescence images of three-dimensional cultures of human cells are used to compare the accuracy of in vitro culture models with their corresponding human tissues. To the best of our knowledge, there is no prior work on a quantitative comparison of histology and in vitro samples. Features are calculated from graph theoretical representations of tissue structures and the data are analyzed in the form of matrices and higher-order tensors using matrix and tensor factorization methods, with a goal of differentiating between cancerous and healthy states of brain, breast, and bone tissues. We also show that our techniques can differentiate between the structural organization of native tissues and their corresponding in vitro engineered cell culture models.

8.
Systematic manipulation of a cell microenvironment with micro- and nanoscale resolution is often required for deciphering various cellular and molecular phenomena. To address this requirement, we have developed a plasma lithography technique to manipulate the cellular microenvironment by creating a patterned surface with feature sizes ranging from 100 nm to millimeters. The goal of this technique is to be able to study, in a controlled way, the behaviors of individual cells as well as groups of cells and their interactions. This plasma lithography method is based on selective modification of the surface chemistry on a substrate by means of shielding the contact of low-temperature plasma with a physical mold. This selective shielding leaves a chemical pattern which can guide cell attachment and movement. This pattern, or surface template, can then be used to create networks of cells whose structure can mimic that found in nature and produces a controllable environment for experimental investigations. The technique is well suited to studying biological phenomena as it produces stable surface patterns on transparent polymeric substrates in a biocompatible manner. The surface patterns last for weeks to months and can thus guide interaction with cells for long time periods, which facilitates the study of long-term cellular processes, such as differentiation and adaptation. The modification to the surface is primarily chemical in nature and thus does not introduce topographical or physical interference for interpretation of results. It also does not involve any harsh or toxic substances to achieve patterning and is compatible with tissue culture. Furthermore, it can be applied to modify various types of polymeric substrates, which, due to the ability to tune their properties, are ideal for and are widely used in biological applications. The resolution achievable is also beneficial, as isolation of specific processes such as migration, adhesion, or binding allows for discrete, clear observations at the single to multicell level. This method has been employed to form diverse networks of different cell types for investigations involving migration, signaling, tissue formation, and the behavior and interactions of neurons arranged in a network.

9.
Endocrine gland-derived vascular endothelial growth factor (EG-VEGF) was recently identified as the first tissue-specific angiogenic molecule. EG-VEGF (the gene product of PROK-1) appears to be expressed exclusively in steroid-producing organs such as the ovary, testis, adrenals and placenta. Since the human pancreatic cells retain steroidogenic activity, in the present study we ascertained whether this angiogenic factor is expressed in normal pancreas and pancreatic adenocarcinoma. Tissue samples from normal males (n=5), normal females (n=5) and from surgically resected adenocarcinomas (n=2) were processed for RT-PCR and immunohistochemical studies. Results from semi-quantitative analysis by RT-PCR suggest a distinct expression level for EG-VEGF in the different tissue samples. The relative amount of EG-VEGF mRNA in pancreas was most abundant in female adenocarcinoma (0.89), followed by male adenocarcinoma (0.71), then normal female (0.64) and normal male (0.38). The expression of mRNA for EG-VEGF in normal tissue was significantly higher in females than in males. All samples examined showed specific immunostaining for EG-VEGF. In male preparations, the positive labeling was localized predominantly within the pancreatic islets, while in female preparations the main staining was detected towards the exocrine portion. Specific immunolabeling was also observed in endothelial cells of pancreatic blood vessels. Our data provide evidence that the human pancreas expresses EG-VEGF, a highly specific mitogen which regulates proliferation and differentiation of the vascular endothelium. This finding could be interpreted in either of two ways: EG-VEGF is not exclusive to endocrine organs, or the pancreas should be considered a functional steroidogenic tissue. The extent of the expression of EG-VEGF appears to have a dimorphic pattern in normal and tumoral pancreatic tissue.

10.
11.
Considering the two-class classification problem in brain imaging data analysis, we propose a sparse representation-based multi-variate pattern analysis (MVPA) algorithm to localize brain activation patterns corresponding to different stimulus classes/brain states respectively. Feature selection can be modeled as a sparse representation (or sparse regression) problem. This technique has been successfully applied to voxel selection in fMRI data analysis. However, a single selection based on sparse representation or other methods tends to obtain only a subset of the most informative features rather than all of them. Herein, our proposed algorithm recursively eliminates informative features selected by a sparse regression method until the decoding accuracy based on the remaining features drops to a threshold close to chance level. In this way, the resultant feature set including all the identified features is expected to involve all the informative features for discrimination. According to the signs of the sparse regression weights, these selected features are separated into two sets corresponding to two stimulus classes/brain states. Next, in order to remove irrelevant/noisy features in the two selected feature sets, we perform a nonparametric permutation test at the individual subject level or the group level. In data analysis, we verified our algorithm with a toy data set and an intrinsic signal optical imaging data set. The results show that our algorithm has accurately localized two class-related patterns. As an application example, we used our algorithm on a functional magnetic resonance imaging (fMRI) data set. Two sets of informative voxels, corresponding to two semantic categories (i.e., “old people” and “young people”), respectively, are obtained in the human brain.

12.
MOTIVATION: Recent studies have shown that microarray gene expression data are useful for phenotype classification of many diseases. A major problem in this classification is that the number of features (genes) greatly exceeds the number of instances (tissue samples). It has been shown that selecting a small set of informative genes can lead to improved classification accuracy. Many approaches have been proposed for this gene selection problem. Most of the previous gene ranking methods typically select 50-200 top-ranked genes and these genes are often highly correlated. Our goal is to select a small set of non-redundant marker genes that are most relevant for the classification task. RESULTS: To achieve this goal, we developed a novel hybrid approach that combines gene ranking and clustering analysis. In this approach, we first applied feature filtering algorithms to select a set of top-ranked genes, and then applied hierarchical clustering on these genes to generate a dendrogram. Finally, the dendrogram was analyzed by a sweep-line algorithm and marker genes are selected by collapsing dense clusters. Empirical study using three public datasets shows that our approach is capable of selecting relatively few marker genes while offering the same or better leave-one-out cross-validation accuracy compared with approaches that use top-ranked genes directly for classification. AVAILABILITY: The HykGene software is freely available at http://www.cs.dartmouth.edu/~wyh/software.htm CONTACT: wyh@cs.dartmouth.edu SUPPLEMENTARY INFORMATION: Supplementary material is available from http://www.cs.dartmouth.edu/~wyh/hykgene/supplement/index.htm.
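The idea of ranking genes and then collapsing redundant ones can be illustrated with a much simpler stand-in. The sketch below is not the HykGene algorithm: instead of hierarchical clustering and a sweep-line pass over the dendrogram, it uses a greedy pass that keeps a top-ranked gene only if it is not highly correlated with a gene already kept. The gene names, scores, `top_n`, and `max_abs_r` are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def select_markers(expression, scores, top_n=3, max_abs_r=0.9):
    """Take the top_n genes by score, then greedily drop near-duplicates."""
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_n]
    markers = []
    for gene in ranked:
        redundant = any(
            abs(pearson(expression[gene], expression[kept])) >= max_abs_r
            for kept in markers
        )
        if not redundant:
            markers.append(gene)
    return markers

# Toy expression profiles across five samples; g2 is nearly a copy of g1.
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.5],
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],
    "g4": [1.0, 1.0, 2.0, 1.0, 1.0],
}
scores = {"g1": 0.9, "g2": 0.8, "g3": 0.7, "g4": 0.1}
markers = select_markers(expression, scores)
```

The greedy pass shares the goal stated in the abstract (few, non-redundant markers) but makes order-dependent choices that a dendrogram-based method would avoid.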

13.
Copy number variations (CNVs), a common genomic mutation associated with various diseases, are important in research and clinical applications. Whole genome amplification (WGA) and massively parallel sequencing have been applied to single cell CNV analysis, which provides new insight for the fields of biology and medicine. However, WGA-induced bias significantly limits sensitivity and specificity for CNV detection. To address these limitations, we developed a practical bioinformatic methodology for CNV detection at the single cell level using low coverage massively parallel sequencing. This method consists of GC correction for removal of WGA-induced bias, a binary segmentation algorithm for locating CNV breakpoints, and dynamic threshold determination for final signal filtering. We then evaluated our method with seven test samples using low coverage sequencing (4∼9.5%). Four single-cell samples from peripheral blood, whose karyotypes were confirmed by whole genome sequencing analysis, were acquired. Three other test samples derived from blastocysts, whose karyotypes were confirmed by SNP-array analysis, were also recruited. The detection results for CNVs larger than 1 Mb were highly consistent with the confirmed results, reaching 99.63% sensitivity and 97.71% specificity at base-pair level. Our study demonstrates the potential to overcome WGA bias and to detect CNVs (>1 Mb) at the single cell level through low coverage massively parallel sequencing. It highlights the potential for CNV research on single cells or limited DNA samples and may prove to be a promising tool for research and clinical applications, such as pre-implantation genetic diagnosis/screening, fetal nucleated red blood cell research and cancer heterogeneity analysis.
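The GC correction step can be sketched in a few lines. The abstract does not give the correction formula, so the approach below is an assumption: read counts per genomic bin are grouped into GC-content strata, and each count is rescaled by the ratio of the overall median to its stratum median. The function name, the `width` parameter, and the toy data are hypothetical; the segmentation and thresholding steps are not shown.

```python
from statistics import median

def gc_correct(counts, gc_percent, width=5):
    """Rescale each bin's count by overall median / median of its GC stratum."""
    overall = median(counts)
    # Group counts by GC stratum (integer GC percent divided into `width`-point bands).
    strata = {}
    for count, gc in zip(counts, gc_percent):
        strata.setdefault(gc // width, []).append(count)
    stratum_median = {key: median(vals) for key, vals in strata.items()}
    corrected = []
    for count, gc in zip(counts, gc_percent):
        m = stratum_median[gc // width]
        corrected.append(count * overall / m if m else 0.0)
    return corrected

# Toy data: high-GC bins are amplified twice as strongly as low-GC bins.
counts = [100, 110, 90, 200, 220, 180]
gc_percent = [40, 41, 42, 60, 61, 62]
corrected = gc_correct(counts, gc_percent)
```

After correction the two GC strata in the toy data have identical profiles, so remaining differences between bins would reflect something other than GC content.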

14.
A mixture model-based approach to the clustering of microarray expression data   (Total citations: 13; self-citations: 0; citations by others: 13)
MOTIVATION: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. RESULTS: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets. AVAILABILITY: EMMIX-GENE is available at http://www.maths.uq.edu.au/~gjm/emmix-gene/

15.
Telomerase activity is responsible for telomere maintenance and is believed to be crucial in most immortal cells and cancer cells; however, its clinicopathological significance in gastric cancer remains to be clarified. The aim of the present study was to assess whether malignant progression of gastric adenocarcinoma correlates with telomerase activity. We also investigated the correlation between telomerase activity and histopathological findings. We examined telomerase activity in tumor specimens and adjacent normal tissues from 43 patients with gastric adenocarcinoma. Telomerase activity was measured quantitatively by the TRAPEZE Gel Based Telomerase Detection Kit. Approximately 98% of the tumor tissues were telomerase positive, but telomerase activity was detected not only in tumor tissues but also in normal gastric mucosa. Although telomerase activity was found to be higher in tumor samples than normal tissue for each subject, we could not find a general cut-off level for telomerase activity in gastric adenocarcinoma. In addition, telomerase activity was not correlated with tumor invasion, lymph node involvement and histological stage. Our results support the idea that telomerase reactivation is a common event in gastric adenocarcinoma and it is not related to histopathological parameters. Since it is difficult to set a cut-off level for this type of cancer, we suggest that the prognostic utility of telomerase assay has not yet reached the clinic in terms of predicting outcome for patients with gastric adenocarcinoma. For the assessment of gastric carcinoma, telomerase activity should be evaluated in both tumor and normal tissues, because normal gastric mucosa samples show appreciable telomerase activity.

16.
A benchmark for Affymetrix GeneChip expression measures   (Total citations: 11; self-citations: 0; citations by others: 11)

17.
We present a novel method for simultaneous genotype calling and haplotype-phase inference. Our method employs the computationally efficient BEAGLE haplotype-frequency model, which can be applied to large-scale studies with millions of markers and thousands of samples. We compare genotype calls made with our method to genotype calls made with the BIRDSEED, CHIAMO, GenCall, and ILLUMINUS genotype-calling methods, using genotype data from the Illumina 550K and Affymetrix 500K arrays. We show that our method has higher genotype-call accuracy and yields fewer uncalled genotypes than competing methods. We perform single-marker analysis of data from the Wellcome Trust Case Control Consortium bipolar disorder and type 2 diabetes studies. For bipolar disorder, the genotype calls in the original study yield 25 markers with apparent false-positive association with bipolar disorder at a p < 10^-7 significance level, whereas genotype calls made with our method yield no associated markers at this significance threshold. Conversely, for markers with replicated association with type 2 diabetes, there is good concordance between genotype calls used in the original study and calls made by our method. Results from single-marker and haplotypic analysis of our method's genotype calls for the bipolar disorder study indicate that our method is highly effective at eliminating genotyping artifacts that cause false-positive associations in genome-wide association studies. Our new genotype-calling methods are implemented in the BEAGLE and BEAGLECALL software packages.

18.
MOTIVATION: It is biologically interesting to address whether human blood outgrowth endothelial cells (BOECs) belong to or are closer to large vessel endothelial cells (LVECs) or microvascular endothelial cells (MVECs) based on global expression profiling. An earlier analysis using a hierarchical clustering and a small set of genes suggested that BOECs seemed to be closer to MVECs. By taking advantage of the two known classes, LVEC and MVEC, while allowing BOEC samples to belong to either of the two classes or to form their own new class, we take a semi-supervised learning approach; for high-dimensional data as encountered here, we propose a penalized mixture model with a weighted L1 penalty to realize automatic feature selection while fitting the model. RESULTS: We applied our penalized mixture model to a combined dataset containing 27 BOEC, 28 LVEC and 25 MVEC samples. Analysis results indicated that the BOEC samples appeared to form their own new class. A simulation study confirmed that, compared with the standard mixture model with or without initial variable selection, the penalized mixture model performed much better in identifying relevant genes and forming corresponding clusters. The penalized mixture model seems to be promising for high-dimensional data with the capability of novel class discovery and automatic feature selection.

19.
The dynamic reorganization of the actin cytoskeleton is regulated by a number of actin binding proteins (ABPs). Four human colon adenocarcinoma cell lines (the parental line and three selected sublines that differ in motility and metastatic potential) were used to investigate the expression level and subcellular localization of selected ABPs. Our interest was focused on cofilin and ezrin. These proteins are essential for cell migration and adhesion. The data obtained for the three more motile adenocarcinoma sublines (EB3, 3LNLN, 5W) were compared with those obtained for the parental LS180 adenocarcinoma cells and fibroblastic NRK cells. Quantitative densitometric analysis and confocal fluorescence microscopy were used to examine the expression levels and subcellular distribution of the selected ABPs. Our data show a distinct increase in the level of cofilin in adenocarcinoma cells, accompanied by a reduction of the inactive phosphorylated form of cofilin. In more motile cells, cofilin accumulated at the cellular periphery, co-localizing with actin filaments. Furthermore, we observed translocation of ezrin towards the cell periphery in more motile cells in comparison with NRK and parental adenocarcinoma cells. In summary, our data indicate a correlation between the migration ability of selected human colon adenocarcinoma sublines and the subcellular distribution as well as the level of cofilin and ezrin. Therefore these proteins might be essential for the higher migratory activity of invasive tumor cells. Key words: actin, cofilin, ezrin, colon adenocarcinoma.

20.
Aims: Formation of different protrusive structures by migrating cells is driven by actin polymerization at the plasma membrane region. Gelsolin is an actin binding protein controlling the length of actin filaments by its severing and capping activity. The main goal of this study was to determine the effect of gelsolin expression on the migration of human colon adenocarcinoma LS180 and melanoma A375 cells. Main methods: Colon adenocarcinoma cell line LS180 was stably transfected with plasmid containing human cytoplasmic gelsolin cDNA tagged to enhanced green fluorescence protein (EGFP). Melanoma A375 cells were transfected with siRNAs directed against gelsolin. Real-time PCR and Western blotting were used to determine the level of gelsolin. The ability of actin to inhibit DNase I activity was used to quantify monomeric and total actin level and calculate the state of actin polymerization. Fluorescence confocal microscopy was applied to observe gelsolin and vinculin distribution along with actin cytoskeleton organization. Key findings: Increased level of gelsolin expression leads to its accumulation at the submembranous region of the cell accompanied by distinct changes in the state of actin polymerization and an increase in the migration of LS180 cells. In addition, LS180 cells overexpressing gelsolin form podosome-like structures as indicated by vinculin redistribution and its colocalization with gelsolin and actin. Downregulation of gelsolin expression in melanoma A375 cells significantly reduces their migratory potential. Significance: Our experimental data indicate that alterations in the expression level of gelsolin and its subcellular distribution may be directly responsible for determining migration capacity of human cancer cells.
