Similar articles
20 similar articles found (search time: 15 ms)
1.
2.
Quantitative PCR (qPCR) is a powerful tool for measuring gene expression levels. Accurate and reproducible results depend on the correct choice of reference genes for data normalization. Atropa belladonna is a commercial plant species from which pharmaceutical tropane alkaloids are extracted. In this study, eight candidate reference genes, namely 18S ribosomal RNA (18S), actin (ACT), cyclophilin (CYC), elongation factor 1α (EF-1α), β-fructosidase (FRU), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), phosphoglycerate kinase (PGK), and beta-tubulin (TUB), were selected and their expression stabilities studied to determine their suitability for normalizing gene expression in A. belladonna. The expression stabilities of these genes were analyzed in the root, stem, and leaf under cold, heat, NaCl, UV-B, methyl jasmonate, salicylic acid, and abscisic acid treatments using geNorm, NormFinder, and BestKeeper. The statistical algorithms indicated that PGK was a reliable gene for normalizing gene expression under most of the experimental conditions. The pairwise variation analysis showed that two genes were sufficient for proper expression normalization, except when analyzing gene expression in heat-treated roots. However, the choice of the second reference gene depended on the specific conditions. Finally, the relative expression level of the PMT gene of A. belladonna was measured to validate the selection of PGK as a reliable reference gene. In summary, our results should guide the selection of appropriate reference genes for gene expression studies in A. belladonna across different organs and under abiotic stress conditions.
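The stability ranking described in this abstract can be illustrated with a small sketch of a geNorm-style stability measure M: for each candidate gene, the average standard deviation of its log2 expression ratio against every other candidate. This is a simplified illustration, not the geNorm software itself. The function name `genorm_m`, the Cq values, and the assumption of roughly 100% amplification efficiency (so that a Cq difference equals a log2 ratio) are all hypothetical and do not come from the study.

```python
from statistics import mean, stdev


def genorm_m(cq):
    """Return a geNorm-style stability value M for each gene.

    cq maps a gene name to a list of Cq values, one per sample,
    with samples in the same order for every gene.
    Lower M means more stable expression relative to the other candidates.
    """
    m = {}
    for gene, values in cq.items():
        pairwise_sds = []
        for other, other_values in cq.items():
            if other == gene:
                continue
            # Assumption: ~100% efficiency, so a Cq difference is a log2 ratio.
            log_ratios = [b - a for a, b in zip(values, other_values)]
            pairwise_sds.append(stdev(log_ratios))
        m[gene] = mean(pairwise_sds)
    return m


# Hypothetical Cq values for three candidates across four samples.
cq = {
    "PGK": [20.0, 20.1, 19.9, 20.0],
    "ACT": [18.0, 18.2, 17.9, 18.1],
    "18S": [10.0, 12.0, 9.0, 11.5],
}
m_values = genorm_m(cq)
ranking = sorted(m_values, key=m_values.get)  # most stable first
print(ranking, m_values)
```

In this made-up data the gene whose Cq fluctuates most between samples receives the largest M and is ranked last.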

3.
4.
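Selecting feature genes for clustering with the Gene Ontology functional classification system   Total citations: 3 (self-citations: 0, citations by others: 3)
Using two gene expression profile datasets, we selected variably expressed genes by the variance of each gene's expression values and used them to cluster samples. We found that taking the top 10% of genes by variance as feature genes is generally sufficient to cluster disease samples well. For different diseases, the feature genes that carry clustering information show different distribution characteristics. On this basis, we used the Gene Ontology (GO) functional classification system to further filter the feature genes used for clustering. By testing whether the variably expressed genes in each GO functional category are non-randomly enriched, we identified disease-related functional categories and then performed cluster analysis using the variably expressed genes in those categories. The experimental results show that further filtering variably expressed genes with the gene function system to obtain clustering feature genes maintains or improves clustering accuracy and gives the clustering results a clear biological interpretation. In addition, we found some genes that may be associated with lymphoma and leukemia.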
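The two steps in this abstract, keeping the top 10% of genes by variance and then testing whether those genes are non-randomly concentrated in a functional category, can be sketched as follows. The abstract does not say which statistical test was used; the one-sided hypergeometric test below is an assumption, and the gene names, expression values, and category membership are invented for illustration.

```python
from math import comb
from statistics import pvariance


def top_variance_genes(expr, fraction=0.10):
    """Return the set of genes in the top `fraction` by expression variance."""
    ranked = sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return set(ranked[:k])


def hypergeom_pvalue(total, category_size, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `total` genes, `category_size` of which belong to the category."""
    upper = min(category_size, selected)
    tail = sum(
        comb(category_size, i) * comb(total - category_size, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / comb(total, selected)


# Hypothetical data: 20 genes, two of which vary strongly across samples.
expr = {f"g{i}": [5.0, 5.1, 4.9, 5.0] for i in range(20)}
expr["g0"] = [1.0, 9.0, 2.0, 8.0]
expr["g1"] = [2.0, 8.0, 1.0, 9.0]

selected = top_variance_genes(expr)   # top 10% of 20 genes = 2 genes
category = {"g0", "g1", "g2"}         # hypothetical functional category
overlap = len(selected & category)
p = hypergeom_pvalue(len(expr), len(category), len(selected), overlap)
print(selected, overlap, p)
```

A small p-value suggests the category holds more variably expressed genes than chance would explain; in a real analysis the p-values would also need correction for the number of categories tested.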

5.
Microarray data has a high dimension of variables but available datasets usually have only a small number of samples, thereby making the study of such datasets interesting and challenging. In the task of analyzing microarray data for the purpose of, e.g., predicting gene-disease association, feature selection is very important because it provides a way to handle the high dimensionality by exploiting information redundancy induced by associations among genetic markers. Judicious feature selection in microarray data analysis can result in significant reduction of cost while maintaining or improving the classification or prediction accuracy of learning machines that are employed to sort out the datasets. In this paper, we propose a gene selection method called Recursive Feature Addition (RFA), which combines supervised learning and statistical similarity measures. We compare our method with the following gene selection methods:
  • Support Vector Machine Recursive Feature Elimination (SVMRFE)
  • Leave-One-Out Calculation Sequential Forward Selection (LOOCSFS)
  • Gradient based Leave-one-out Gene Selection (GLGS)
To evaluate the performance of these gene selection methods, we employ several popular learning classifiers on the MicroArray Quality Control phase II on predictive modeling (MAQC-II) breast cancer dataset and the MAQC-II multiple myeloma dataset. Experimental results show that the performance of a gene selection method depends strongly on the learning classifier it is paired with. Overall, our approach outperforms the other compared methods. The biological functional analysis based on the MAQC-II breast cancer dataset convinced us to apply our method for phenotype prediction. Additionally, learning classifiers also play important roles in the classification of microarray data, and our experimental results indicate that the Nearest Mean Scale Classifier (NMSC) is a good choice due to its prediction reliability and its stability across the three performance measurements: testing accuracy, MCC values, and AUC errors.
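Two of the performance measurements named above, testing accuracy and the Matthews correlation coefficient (MCC), can be computed from binary predictions as in the sketch below. The labels and predictions are invented for illustration and have nothing to do with the MAQC-II datasets.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0 and 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical test-set labels and classifier predictions.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
acc = accuracy(y_true, y_pred)
score = mcc(y_true, y_pred)
print(acc, score)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it more informative when the two classes are unbalanced.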

6.
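A study of the effectiveness of two filter-based feature gene selection algorithms   Total citations: 2 (self-citations: 0, citations by others: 2)
李丽, 李霞, 郭政, 汪强虎, 王海芸. 《生命科学研究》, 2003, 7(4): 369-373, 376
Feature gene selection on gene expression profiles not only improves the performance of disease classification methods but also offers a new route to finding disease-related feature genes. We compared support vector machine (SVM) classifiers built after applying two feature gene selection algorithms, a t-test with adjusted p-values and a nonparametric scoring method, with classifiers built without selection, in terms of classification performance, agreement of support vectors (SVs), agreement of misclassified sample IDs, and stability after uniformly doubling the samples. The results show that after feature selection, the classification performance of SVMs with linear, second-order polynomial, and radial basis function kernels improved markedly; the agreement of SVs and of misclassified sample IDs before and after feature selection was high; and the SVMs were stable. We conclude that both feature selection algorithms are effective to a certain degree.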
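A filter-style selection based on a t-test can be sketched by ranking genes on the absolute value of a two-sample t-statistic. This is only a partial illustration: it uses Welch's form of the statistic as an assumption, and it omits the p-value computation and adjustment that the abstract mentions, since the adjustment method is not stated. Gene names and expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t-statistic (unequal variances).

    Assumes each group has at least two values and nonzero total variance.
    """
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_genes(expr, labels):
    """Rank genes by |t| between samples labeled 1 and samples labeled 0."""
    scores = {}
    for gene, values in expr.items():
        group1 = [v for v, y in zip(values, labels) if y == 1]
        group0 = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = abs(welch_t(group1, group0))
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical expression values: three disease samples, then three controls.
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "geneA": [8.0, 8.2, 7.9, 4.0, 4.1, 3.9],
    "geneB": [5.0, 5.5, 4.8, 5.1, 5.3, 4.9],
    "geneC": [6.0, 6.4, 6.2, 5.6, 5.9, 5.7],
}
ranking = rank_genes(expr, labels)
print(ranking)
```

The top-ranked genes would then be passed to the classifier, here an SVM in the study, in place of the full gene set.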

7.
A major challenge in biomedical studies in recent years has been the classification of gene expression profiles into categories, such as cases and controls. This is done by first training a classifier on a training set containing labeled samples from the two populations, and then using that classifier to predict the labels of new samples. Such predictions have recently been shown to improve the diagnosis and treatment selection practices for several diseases. This procedure is complicated, however, by the high dimensionality of the data. While microarrays can measure the levels of thousands of genes per sample, case-control microarray studies usually involve no more than several dozen samples. Standard classifiers do not work well in these situations where the number of features (gene expression levels measured in these microarrays) far exceeds the number of samples. Selecting only the features that are most relevant for discriminating between the two categories can help construct better classifiers, in terms of both accuracy and efficiency. In this work we developed a novel method for multivariate feature selection based on the Partial Least Squares algorithm. We compared the method's variants with common feature selection techniques across a large number of real case-control datasets, using several classifiers. We demonstrate the advantages of the method and the preferable combinations of classifier and feature selection technique.

8.
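Selection and application of reference genes in gene expression studies   Total citations: 4 (self-citations: 0, citations by others: 4)
Housekeeping genes are a class of genes without tissue specificity that are expressed in all tissues and cells of a species. They are widely used as reference genes to detect changes in the expression patterns of target genes across tissues and organs, at particular developmental stages, or under environmental stress. However, these housekeeping genes are not stably expressed, and thus are not ideal reference genes, under all physiological conditions. In transcriptional analysis of gene expression, most commonly used reference genes can no longer meet the requirements of accurate quantification. Statistical analysis software, such as geNorm, BestKeeper, and NormFinder, can be used to screen for reference genes with better stability. This article reviews the selection criteria, methods, and applications of reference genes.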

9.

Background

Among the primary goals of microarray analysis is the identification of genes that could distinguish between different phenotypes (feature selection). Previous studies indicate that incorporating prior information on the genes' function could help identify physiologically relevant features. However, current methods that incorporate prior functional information do not provide a relative estimate of the effect of different genes on the biological processes of interest.

Results

Here, we present a method that integrates gene ontology (GO) information and expression data using Bayesian regression mixture models to perform unsupervised clustering of the samples and identify physiologically relevant discriminating features. As a model application, the method was applied to identify the genes that play a role in the cytotoxic responses of human hepatoblastoma cell line (HepG2) to saturated fatty acid (SFA) and tumor necrosis factor (TNF)-α, as compared to the non-toxic response to the unsaturated FFAs (UFA) and TNF-α. Incorporation of prior knowledge led to a better discrimination of the toxic phenotypes from the others. The model identified roles of lysosomal ATPases and adenylate cyclase (AC9) in the toxicity of palmitate. To validate the role of AC in palmitate-treated cells, we measured the intracellular levels of cyclic AMP (cAMP). The cAMP levels were found to be significantly reduced by palmitate treatment and not by the other FFAs, in accordance with the model selection of AC9.

Conclusions

A framework is presented that incorporates prior ontology information, which helped to (a) perform unsupervised clustering of the phenotypes, and (b) identify the genes relevant to each cluster of phenotypes. We demonstrate the proposed framework by applying it to identify physiologically-relevant feature genes that conferred differential toxicity to saturated vs. unsaturated FFAs. The framework can be applied to other problems to efficiently integrate ontology information and expression data in order to identify feature genes.

10.
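华琳, 郑卫英, 刘红, 林慧, 高磊. 《生物工程学报》, 2008, 24(9): 1643-1648
Using a random forest pathway analysis method, we screened for feature metabolic pathways by the classification error rate on out-of-bag (OOB) samples, studied gene expression correlation on the feature pathways, and applied the MAP (Mining attribute profile) algorithm to the genes on these pathways to mine co-regulated expression patterns under different experimental conditions, which were then clustered. The analysis shows that genes on the same feature metabolic pathway tend to have similar expression, and that 2 feature metabolic pathways exhibit co-expression patterns. One of these pathways contains 108 expression patterns; when these patterns were clustered, the lowest similarity coefficient among the clusters was still as high as 0.623. This indicates that the co-expression patterns of genes on the same feature metabolic pathway remain highly similar under different experimental conditions. The results offer a reference for studying complex diseases using pathways as gene modules.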

11.
12.
In both prokaryotic and eukaryotic cells, gene expression is regulated across the cell cycle to ensure “just-in-time” assembly of select cellular structures and molecular machines. However, present in all time-series gene expression measurements is variability that arises from both systematic error in the cell synchrony process and variance in the timing of cell division at the level of the single cell. Thus, gene or protein expression data collected from a population of synchronized cells is an inaccurate measure of what occurs in the average single-cell across a cell cycle. Here, we present a general computational method to extract “single-cell”-like information from population-level time-series expression data. This method removes the effects of 1) variance in growth rate and 2) variance in the physiological and developmental state of the cell. Moreover, this method represents an advance in the deconvolution of molecular expression data in its flexibility, minimal assumptions, and the use of a cross-validation analysis to determine the appropriate level of regularization. Applying our deconvolution algorithm to cell cycle gene expression data from the dimorphic bacterium Caulobacter crescentus, we recovered critical features of cell cycle regulation in essential genes, including ctrA and ftsZ, that were obscured in population-based measurements. In doing so, we highlight the problem with using population data alone to decipher cellular regulatory mechanisms and demonstrate how our deconvolution algorithm can be applied to produce a more realistic picture of temporal regulation in a cell.

13.
14.
15.
16.
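王蕊平, 王年, 苏亮亮, 陈乐. 《生物信息学》, 2011, 9(2): 164-166, 170
The existence of massive data is a hallmark of the modern information society, and effectively selecting classification features of samples from among thousands of genes is of great significance for cancer diagnosis and treatment. Local non-negative matrix factorization is used to extract features from cancer gene expression profile data. The gene expression profile data are first filtered; a local non-negative matrix is then constructed and factorized to obtain low-dimensional feature vectors that adequately represent the samples; finally, a support vector machine classifies the feature vectors. The results demonstrate the feasibility and effectiveness of the method.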

17.
The prediction of the secondary structure of a protein from its amino acid sequence is an important step towards the prediction of its three-dimensional structure. However, the accuracy of ab initio secondary structure prediction from sequence is currently about 80%, which is still far from satisfactory. In this study, we proposed a novel method that uses the binomial distribution to optimize tetrapeptide structural words and the increment of diversity with quadratic discriminant to predict protein three-state secondary structure. A benchmark dataset including 2,640 proteins with sequence identity of less than 25% was used to train and test the proposed method. The results indicate that an overall accuracy of 87.8% was achieved in secondary structure prediction by using ten-fold cross-validation. Moreover, the accuracy of predicted secondary structures ranges from 84 to 89% at the level of residue. These results suggest that the feature selection technique can detect the optimized tetrapeptide structural words which affect the accuracy of predicted secondary structures.

18.
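Gene expression microarray databases are online databases that support data storage, querying, downloading, and analysis, and they provide a large source of data for tumor-related research. Because microarray analysis remains difficult for researchers without a background in bioinformatics or medical informatics, these databases are not yet widely used. This article gives an overview of commonly used gene expression microarray databases in terms of data querying, downloading and analysis, and usage, and summarizes current strategies for applying them, with the aim of helping newcomers to the field understand the basics of these databases and promoting their use in research.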

19.
20.
Quantitative real-time RT-PCR (qPCR) has been widely used to investigate gene expression during seed germination, a process involving seed transition from a dry, physiologically inactive state to a hydrated, active state. This transition may alter the expression of many housekeeping genes (HKGs) conventionally used as internal controls, which makes the selection of HKGs in such scenarios a challenge. The objectives of this study were to identify valid reference genes for seed priming and germination studies, both of which involve a transition in seed hydration status, and to assess whether findings derived from the “seed model” used in this study would also apply to other plant species. Eight commonly used HKGs were evaluated in maize seeds during hydropriming and germination. Using BestKeeper, geNorm, and NormFinder, we ranked these HKGs by stability. Actdf, UBQ, βtub, 18S, Act, and GAPDH were adjudged valid internal controls by geNorm and NormFinder. Under the second objective, we conducted a case study with spinach seeds collected during osmopriming and germination. Our results indicate that the conclusions derived from maize were applicable to spinach as well, in that 18S exhibited greater expression stability than GAPDH in osmoprimed and germinated seeds; this held true even under stress conditions. While both of these genes were rejected by BestKeeper, we found that 18S exhibited stable expression when “dry” and “hydrated” seeds were analyzed as separate data sets. Although this approach precludes comparison between “hydrated” and “dry” seeds, it still provides effective comparison among samples of the same hydration status.
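The observation that a gene can look unstable across the whole data set yet stable within each hydration status can be illustrated with a simple variability check on Cq values. The sketch below uses the plain sample standard deviation as a stand-in and is not BestKeeper's exact statistic. The Cq values are hypothetical, and the cutoff of 1.0 is a rule of thumb assumed here for illustration, not a figure taken from the abstract.

```python
from statistics import stdev


def cq_sd_by_group(cq_values, groups):
    """Return the sample SD of Cq within each group and over all samples."""
    by_group = {}
    for cq, group in zip(cq_values, groups):
        by_group.setdefault(group, []).append(cq)
    result = {group: stdev(values) for group, values in by_group.items()}
    result["combined"] = stdev(cq_values)
    return result


# Hypothetical Cq values for one reference gene in dry and hydrated seeds.
cq_values = [14.0, 14.2, 13.9, 11.0, 11.3, 10.9]
groups = ["dry", "dry", "dry", "hydrated", "hydrated", "hydrated"]

sd = cq_sd_by_group(cq_values, groups)
SD_CUTOFF = 1.0  # assumed rule-of-thumb threshold, for illustration only
flags = {name: value > SD_CUTOFF for name, value in sd.items()}
print(sd, flags)
```

With these numbers the combined data set exceeds the cutoff while each subset stays well below it, which mirrors the trade-off the abstract describes: analyzing the subsets separately allows comparison within a hydration status but not between them.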
