Similar articles
 20 similar articles found (search time: 15 ms)
1.
Mixture modelling of gene expression data from microarray experiments   Cited by: 5 (self-citations: 0, citations by others: 5)
MOTIVATION: Hierarchical clustering is one of the major analytical tools for gene expression data from microarray experiments. A major problem in the interpretation of the output from these procedures is assessing the reliability of the clustering results. We address this issue by developing a mixture model-based approach for the analysis of microarray data. Within this framework, we present novel algorithms for clustering genes and samples. One of the byproducts of our method is a probabilistic measure for the number of true clusters in the data. RESULTS: The proposed methods are illustrated by application to microarray datasets from two cancer studies; one in which malignant melanoma is profiled (Bittner et al., Nature, 406, 536-540, 2000), and the other in which prostate cancer is profiled (Dhanasekaran et al., 2001, submitted).
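
Illustrative sketch (not the authors' algorithm): the general idea of mixture-model clustering can be shown with a two-component, one-dimensional Gaussian mixture fitted by expectation-maximization. The function names, the toy expression values, and the fixed choice of two components are assumptions made for this example only. The point is that each observation receives a membership probability rather than a hard cluster label.

```python
import math

def normal_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_two_gaussians(data, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM (toy version)."""
    mu = [min(data), max(data)]   # crude initialisation at the extremes
    var = [1.0, 1.0]
    weight = [0.5, 0.5]
    resp = []
    for _ in range(iterations):
        # E-step: probability that each point belongs to each component.
        resp = []
        for x in data:
            p = [weight[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weight[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            spread = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(spread, 1e-6)  # guard against a collapsing component
    return weight, mu, var, resp

# Hypothetical expression values forming two well-separated groups.
data = [0.8, 0.9, 1.0, 1.1, 1.2, 4.8, 4.9, 5.0, 5.1, 5.2]
weight, mu, var, resp = em_two_gaussians(data)
```

Here `resp[i][k]` is the estimated probability that observation `i` belongs to component `k`, which is the kind of quantity a model-based approach can use to judge how reliable a cluster assignment is.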

2.
Using gene expression data to classify tumor types is a very promising tool in cancer diagnosis. Previous works show several pairs of tumor types can be successfully distinguished by their gene expression patterns (Golub et al. 1999, Ben-Dor et al. 2000, Alizadeh et al. 2000). However, the simultaneous classification across a heterogeneous set of tumor types has not been well studied yet. We obtained 190 samples from 14 tumor classes and generated a combined expression dataset containing 16063 genes for each of those samples. We performed multi-class classification by combining the outputs of binary classifiers. Three binary classifiers (k-nearest neighbors, weighted voting, and support vector machines) were applied in conjunction with three combination scenarios (one-vs-all, all-pairs, hierarchical partitioning). We achieved the best cross validation error rate of 18.75% and the best test error rate of 21.74% by using the one-vs-all support vector machine algorithm. The results demonstrate the feasibility of performing clinically useful classification from samples of multiple tumor types.
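
Illustrative sketch (not the authors' code): the one-vs-all combination scheme mentioned above trains one binary scorer per class and assigns a sample to the class whose scorer is most confident. The binary scorer used here is a simple centroid comparison, chosen only to keep the example short; it stands in for the k-nearest-neighbor, weighted-voting, and SVM classifiers named in the abstract. All function names and the toy data are assumptions.

```python
def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_one_vs_all(samples, labels):
    """Train one binary scorer per class: that class versus all the others."""
    models = {}
    for cls in sorted(set(labels)):
        pos = [s for s, l in zip(samples, labels) if l == cls]
        neg = [s for s, l in zip(samples, labels) if l != cls]
        models[cls] = (centroid(pos), centroid(neg))
    return models

def predict(models, x):
    """Return the class whose binary scorer reports the highest confidence."""
    def confidence(model):
        pos_c, neg_c = model
        # Positive when x lies closer to the class centroid than to the rest.
        return sq_dist(x, neg_c) - sq_dist(x, pos_c)
    return max(models, key=lambda cls: confidence(models[cls]))

# Hypothetical three-gene expression profiles for three tumor classes.
samples = [[5.0, 0.1, 0.2], [4.8, 0.0, 0.1],
           [0.1, 5.1, 0.0], [0.2, 4.9, 0.3],
           [0.0, 0.2, 5.0], [0.1, 0.1, 4.7]]
labels = ["A", "A", "B", "B", "C", "C"]
models = train_one_vs_all(samples, labels)
```

Calling `predict(models, profile)` on a new profile returns one of the class labels. An all-pairs scheme would instead train one scorer per pair of classes and combine their votes.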

3.
Hemoglobin genes from the nitrogen-fixing nonlegume Parasponia andersonii and the related non-nitrogen-fixing nonlegume Trema tomentosa have been isolated [Landsmann et al. (1986). Nature 324, 166-168; Bogusz et al. (1988). Nature 331, 178-180]. The promoters of these genes have been linked to a beta-glucuronidase reporter gene and introduced into both the nonlegume Nicotiana tabacum and the legume Lotus corniculatus. Both promoters directed root-specific expression in transgenic tobacco. When transgenic Lotus plants were nodulated by Rhizobium loti, both promoter constructs showed a high level of nodule-specific expression confined to the central bacteroid-containing portion of the nodule corresponding to the expression seen for the endogenous Lotus leghemoglobin gene. The T. tomentosa promoter was also expressed at a low level in the vascular tissue of the Lotus roots. The hemoglobin promoters from both nonlegumes, including the non-nodulating species, must contain conserved cis-acting DNA signals that are responsible for nodule-specific expression in legumes. We have identified sequence motifs postulated previously as the nodule-specific regulatory elements of the soybean leghemoglobin genes [Stougaard et al. (1987). EMBO J. 6, 3565-3569].

4.
5.
Mutations in the notch ligand delta-like 3 have been identified in both the pudgy mouse (Dll3(pu); Kusumi et al.: Nat Genet 19:274-278, 1998) and the human disorder spondylocostal dysostosis (SCD; Bulman et al.: Nat Genet 24:438-441, 2000), and a targeted mutation has been generated (Dll3(neo); Dunwoodie et al.: Development 129:1795-1806, 2002). Vertebral and rib malformations deriving from defects in somitic patterning are key features of these disorders. In the mouse, notch pathway genes such as Lfng, Hes1, Hes7, and Hey2 display dynamic patterns of expression in paraxial mesoderm, cycling in synchrony with somite formation (Aulehla and Johnson: Dev Biol 207:49-61, 1999; Forsberg et al.: Curr Biol 8:1027-1030, 1998; Jouve et al.: Development 127:1421-1429, 2000; McGrew et al.: Curr Biol 8:979-982, 1998; Nakagawa et al.: Dev Biol 216:72-84, 1999). We report here that the Dll3(pu) mutation has different effects on the expression of cycling (Lfng and Hes7) and stage-specific genes (Hey3 and Mesp2). This suggests a more complex situation than a single oscillatory mechanism in somitogenesis and provides an explanation for the unique radiological features of the human DLL3-type of SCD.

6.
The fourth edition of this workshop mainly focused on three different human oncotypes, which included thyroid, urinary bladder, and prostate tumors as clinical models to gain new basic knowledge on tumor diagnosis, prognosis, and treatment. At the previous editions (Giordano et al., 2000, J Cell Physiol 183:284-287; Giordano et al., 2001, J Cell Physiol 188:274-280; Giordano et al., 2002, J Cell Physiol 191:362-365), leaders in the fields of pathology, clinical oncology, and basic research presented and discussed the most recent and prevalent findings in such neoplasms from a basic and clinical perspective. A concept that has been widely proposed is that the analysis of intrinsic biological factors displayed by primary tumors may be a valid method for diagnosing different neoplasias and for measuring both their aggressiveness and response to therapy. To date, however, no single prognostic factor, such as oncogenes, suppressor genes, or genes involved in the control of the cell cycle and/or apoptosis has yet proven to be potent enough to be used in clinical practice as a prognostic and predictive factor. The new possibility to simultaneously analyze the expression of the complete repertoire of human genes and a large number of proteins could offer a new scenario in tumor classification, allowing for the formulation of a list of genes able to define a "signature" of tumor outcome. Moreover, starting from data obtained from biomolecular tumor analyses, it has been demonstrated that with this approach, it is also possible to design future therapeutic strategies.

7.
MOTIVATION: The process of determining the functional sequence content of an organism is confounded by several factors. Large protein coding sequences are relatively easy to find by statistical methods. Smaller proteins however may escape detection due to their size falling below some arbitrary researcher-defined minimum cutoff, or the inability to precisely define a promoter, or translational start (Delcher et al., Nucleic Acids Res., 27, 4636-4641, 1999). Promoter and regulatory sequences themselves are difficult to define due to a significant amount of allowable sequence variation, as well as a probable lack of any completely accurate whole-organismal gene catalogs to date. Finally, certain genes coding functional RNAs may have insufficient structural or sequence constraints to be detectable by normal sequence structure/pattern searching methods (Eddy and Rivas, Bioinformatics, 16, 583-605, 2000). In those cases where there are multiple closely related organisms that have been sequenced, there is additional information that may be used in the investigation of sequence content-that being the possible conserved nature of functional sequences between the organisms. We present a method for the utilization of this conserved information to detect genes and other potentially functional sequences that may be missed by standard ORF-calling, RNA finding, and pattern matching software. The tricross programs produce a multi-way cross comparison of three sets of sequences, determine which are conserved in all three sets, and produce a graphical (Virtual Reality Modelling Language-VRML; (ISO/IEC 14772-1: 1997, VDC), 1997) representation as well as alignments of all sequence triples found. The software can also be applied to a pair of sequence sets, though the noise in the results increases. 
RESULTS: Tricross has been used to examine the intergenic-sequence content of the three archaeal Pyrococcus genomes to determine the most highly related sequences remaining between the annotated protein and RNA coding sequences. Set to relatively stringent similarity requirements for the search, tricross found 101 intergenic sequences conserved among the three organisms. Interestingly, 29 of these appear to contain members of a family of small RNA molecules (Kiss-Laszlo et al., EMBO J., 17, 797-807, 1998) only recently discovered in the Archaea (Armbruster, OSU, Diss., 1988; Omer et al., Science, 288, 517-522, 2000; Gaspin et al., J. Mol. Biol., 297, 895-906, 2000). While some of the remaining 72 appear to be individual highly conserved promoter sequences, others have no currently known biological significance. Although originally developed to facilitate the examination of intergenic sequences, none of the tricross logic is inherently specific to intergenic sequences. The software can also be applied to gene sequences, and has been used to produce inter-genomic gene order dot-plots for Haemophilus influenzae (Fleischmann et al., Science, 269, 496-512, 1995) versus H. ducreyi (unpublished data), and Neisseria meningitidis Z2491 (serogroup A) (Parkhill et al., Nature, 404, 502-506, 2000) versus Neisseria meningitidis Z58 (serogroup B) (Tettelin et al., Science, 287, 1809-1815, 2000) versus Neisseria gonorrhoeae (Lewis et al., http://micro-gen.ouhsc.edu/, 2000). AVAILABILITY: The tricross software package is available from http://www.biosci.ohio-state.edu/~ray/bioinformatics/tricross.html. CONTACT: ray@biosci.ohio-state.edu; daniels.7@osu.edu; munsonr@pediatrics.ohio-state.edu SUPPLEMENTARY INFORMATION: Additional data from the cross-genomic comparisons examined in the discussion section are linked from http://www.biosci.ohio-state.edu/~ray/bioinformatics/tricross.html.

8.
The Spemann organizer can be subdivided into head- and trunk-inducing tissues along the anteroposterior axis (Mangold, 1933. Naturwissenschaften 43, 761-766; Spemann, 1931. Wilhelm Roux Arch. Entwicklungsmech. Org. 123, 389-517). Recent studies have suggested that head formation is brought about by repression of both Wnt and BMP signalling (Glinka et al., 1998. Nature 391, 357-362; Glinka et al., 1997. Nature 389, 517-519). Several Wnt inhibitors secreted from the head organizer region have been identified in Xenopus, such as Cerberus (Bouwmeester et al., 1996. Nature 382, 595-601), Frzb-1 (Leyns et al., 1997. Cell 88, 747-756; Lin et al., 1997. Proc. Natl. Acad. Sci. USA 94, 11196-11200), and Dkk-1 (Glinka et al., 1998. Nature 391, 357-362), supporting this two-inhibitor model. To isolate genes expressed in the head organizer, we screened a prechordal plate cDNA library by sequencing and expression pattern, and isolated the Xenopus ortholog of chick crescent encoding a Frizzled-like domain that is related to Wnt-binding regions of the Frizzled-family proteins. Expression of Xenopus crescent was first detected in the Spemann organizer region at the early gastrula stage and later in prechordal plate cells lining the boundary of mesoderm and ectoderm layers and in the anterior endoderm. At tailbud stages, the expression in the endomesoderm region was diminished, while expression in the pronephros became detectable. In animal cap assays, the crescent gene was synergistically upregulated by coexpression of Xlim1, Ldb1, and Siamois, but not by Activin treatment.

9.
Short interfering (si) RNAs have now been shown to inhibit gene expression in several species, including mammals (Elbashir et al.: Nature 411:494-498, 2001; Fire et al.: Nature 391:806-811, 1998). RNA inhibition in primary cells such as stem cells would facilitate rapid gene discovery in a postgenome era. While retroviruses can deliver siRNA expression cassettes for stable expression (Barton and Medzhitov: Proc Natl Acad Sci USA 99:14943-14945, 2002; Paddison et al.: Proc Natl Acad Sci USA 99:1443-1448, 2002; Rubinson et al.: Nat Genet 33:401-406, 2003), an efficient method for direct transfer of siRNA to stem cells is still lacking. Here, we established electroporation to deliver siRNA to hematopoietic progenitors. On average, at least 80% of cells take up the RNA, and these display nearly 100% knockout of marker gene expression at both the RNA and protein level. Moreover, knockdown of the hematopoietic regulator, CD45, results in 3-fold more hematopoietic colonies in a progenitor assay. These results demonstrate that transient transfection of siRNA to primary cells can have substantial functional consequences. This technology may be applicable to a variety of primary cell types.

10.
Connexons and cell adhesion: a romantic phase   Cited by: 3 (self-citations: 1, citations by others: 2)
Recent evidence indicates that gap junction forming proteins do not only contribute to intercellular communication (Kanno and Saffitz in Cardiovasc Pathol 10:169-177, 2001; Saez et al. in Physiol Rev 83:1359-1400, 2003), ion homeostasis and volume control (Goldberg et al. in J Biol Chem 277:36725-36730, 2002; Saez et al. in Physiol Rev 83:1359-1400, 2003). They also serve biological functions in a mechanical sense, supporting adherent connections between neighbouring cells of epithelial and non-epithelial tissues (Clair et al. in Exp Cell Res 314:1250-1265, 2008; Shaw et al. in Cell 128:547-560, 2007), where they stabilize migratory pathways in the developing central nervous system (Elias et al. in Nature 448:901-907, 2007; Malatesta et al. in Development 127:5253-5263, 2000; Noctor et al. in Nature 409:714-720, 2001; Rakic in Brain Res 33:471-476, 1971; J Comp Neurol 145:61-83, 1972; Science 241:170-176, 1988), or mediate polarized movements and directionality of neural crest cells during organogenesis (Kirby and Waldo in Circ Res 77:211-215, 1995; Xu et al. in Development 133:3629-3639, 2006). Since most data describing adhesive properties of gap junctions dealt with connexin 43 (Cx43) (Beardslee et al. in Circ Res 83:629-635, 1998), we will focus our brief review on this isoform.

11.
MOTIVATION: High-density DNA microarrays measure the activities of several thousand genes simultaneously, and gene expression profiles have recently been used for cancer classification. This new approach promises to give better therapeutic measurements to cancer patients by diagnosing cancer types with improved accuracy. The Support Vector Machine (SVM) is one of the classification methods successfully applied to cancer diagnosis problems. However, its optimal extension to more than two classes was not obvious, which might impose limitations on its application to multiple tumor types. We briefly introduce the Multicategory SVM (MSVM), which is a recently proposed extension of the binary SVM, and apply it to multiclass cancer diagnosis problems. RESULTS: Its applicability is demonstrated on the leukemia data (Golub et al., 1999) and the small round blue cell tumors of childhood data (Khan et al., 2001). Comparable classification accuracy shown in the applications and its flexibility render the MSVM a viable alternative to other classification methods. SUPPLEMENTARY INFORMATION: http://www.stat.ohio-state.edu/~yklee/msvm.htm

12.
13.
14.
A large number of epidemiological and experimental studies suggest that prolonged (>100 s) weak 50-60-Hz electric and magnetic field (EMF) exposures may cause biological effects (NIEHS Working Group, NIH, 1998; Bersani, 1999). We show, however, that for typical temperature sensitivities of biochemical processes, realistic temperature variations during long exposures raise the threshold exposure by two to three orders of magnitude over a fundamental value, independent of the biophysical coupling mechanism. Temperature variations have been omitted in previous theoretical analyses of possible weak field effects, particularly stochastic resonance (Bezrukov and Vodyanoy, 1997a. Nature. 385:319-321; Astumian et al., 1997. Nature. 338:632-633; Bezrukov and Vodyanoy, 1997b. Nature. 338:663; Dykman and McClintock, 1998. Nature. 391:344; McClintock, 1998; Gammaitoni et al., 1998. Rev. Mod. Phys. 70:223-287). Although sensory systems usually respond to much shorter (approximately 1 s) exposures and can approach fundamental limits (Bialek, 1987. Annu. Rev. Biophys. Biophys. Chem. 16:455-468; Adair et al., 1998. Chaos. 8:576-587), our results significantly decrease the plausibility of effects for nonsensory biological systems due to prolonged, weak-field exposures.

15.
In 2004, a study from our lab published in the journal Nature re-ignited a worldwide debate over the validity of the dogma that mammalian females are incapable of oocyte and follicle production during postnatal life. Amidst widespread skepticism, we forged ahead and published a second study in 2005 in the journal Cell, which not only reaffirmed with different experimental approaches that this dogma is invalid but also identified cells in bone marrow (BM) and blood of adult female mice that could generate oocytes contained within immature follicles in the ovaries of recipient females following transplantation. Although this work has been the subject of extensive critical commentary as well, two recent reports from others have confirmed the germline potential of adult BM-derived cells in mice. Further, independent corroboration of the results and conclusions presented in our earlier Nature paper is also now available. However, two papers have been published that purportedly question our work and conclusions. The first is a paper by Eggan et al. published in the journal Nature, which attempts to draw conclusions about the germline potential of BM-derived cells after focusing solely on ovulated eggs while ignoring what may be occurring at the level of oogenesis in the ovaries. The second is a report from Liu et al. just released in the journal Developmental Biology that claims to provide evidence refuting the possibility that adult female mammals produce new oocytes. However, all of the data presented in this latter report are derived from gene expression studies that the authors say fail to show the occurrence of meiosis or germ cell mitosis in adult human ovaries. 
Given that more than three years have passed since our initial study challenging the dogma was published, it is our belief that continuing arguments against the possibility of postnatal oogenesis in mammals should be based on more rigorous experimental approaches than simply an absence of evidence, especially from gene expression analyses. Further, the interpretations offered by Liu et al. of their results are not as straightforward as they contend since some of their data can also be viewed as supportive of postnatal oogenesis in reproductive age women.

16.
The monophyly of Rodentia has repeatedly been challenged based on several studies of molecular sequence data. Most recently, D'Erchia et al. (1996) analyzed complete mtDNA sequences of 16 mammals and concluded that rodents are not monophyletic. We have reanalyzed these data using maximum-likelihood methods. We use two methods to test for significance of differences among alternative topologies and show that (1) models that incorporate variation in evolutionary rates across sites fit the data dramatically better than models used in the original analyses, (2) the mtDNA data fail to refute rodent monophyly, and (3) the original interpretation of strong support for nonmonophyly results from systematic error associated with an oversimplified model of sequence evolution. These analyses illustrate the importance of incorporating recent theoretical advances into molecular phylogenetic analyses, especially when results of these analyses conflict with classical hypotheses of relationships.

17.
Linear regression and two-class classification with gene expression data   Cited by: 3 (self-citations: 0, citations by others: 3)
MOTIVATION: Using gene expression data to classify (or predict) tumor types has received much research attention recently. Due to some special features of gene expression data, several new methods have been proposed, including the weighted voting scheme of Golub et al., the compound covariate method of Hedenfalk et al. (originally proposed by Tukey), and the shrunken centroids method of Tibshirani et al. These methods look different and are more or less ad hoc. RESULTS: We point out a close connection of the three methods with a linear regression model. Casting the classification problem in the general framework of linear regression naturally leads to new alternatives, such as partial least squares (PLS) methods and penalized PLS (PPLS) methods. Using two real data sets, we show the competitive performance of our new methods when compared with the other three methods.
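
Illustrative sketch (an assumption-laden toy, not the paper's code): the weighted voting scheme is commonly described as giving each gene a weight (m1 - m2) / (s1 + s2) and a decision boundary (m1 + m2) / 2, where m and s are the class means and standard deviations of that gene; each gene then votes weight * (x - boundary). Under that description the total vote is a linear function of the expression profile, which is the sense in which such a rule can be related to a linear model. The function names and toy data below are hypothetical.

```python
from statistics import mean, stdev

def fit_weighted_voting(class1, class2):
    """Return one (weight, boundary) pair per gene from two sample groups."""
    params = []
    # zip(*rows) turns a list of samples into per-gene columns.
    for g1, g2 in zip(zip(*class1), zip(*class2)):
        m1, m2 = mean(g1), mean(g2)
        weight = (m1 - m2) / (stdev(g1) + stdev(g2))
        boundary = (m1 + m2) / 2
        params.append((weight, boundary))
    return params

def vote_total(params, x):
    """Sum of per-gene votes; positive favours class 1, negative class 2."""
    return sum(w * (xg - b) for (w, b), xg in zip(params, x))

# Hypothetical two-gene profiles: gene 1 separates the classes, gene 2 barely.
class1 = [[2.0, 0.5], [2.4, 0.7], [2.2, 0.3]]
class2 = [[0.2, 0.6], [0.4, 0.4], [0.3, 0.8]]
params = fit_weighted_voting(class1, class2)

# The same rule written as an explicit linear function: slope . x + intercept.
slopes = [w for w, _ in params]
intercept = -sum(w * b for w, b in params)
```

This toy version assumes every gene has non-zero spread in both classes; a real implementation would need to handle zero variance and to select informative genes first.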

18.
Template-directed replication is known to obey a parabolic growth law due to product inhibition (Sievers & Von Kiedrowski 1994 Nature 369, 221; Lee et al. 1996 Nature 382, 525; Varga & Szathmáry 1997 Bull. Math. Biol. 59, 1145). We investigate a template-directed replication with a coupled template catalysed lipid aggregate production as a model of a minimal protocell and show analytically that the autocatalytic template-container feedback ensures balanced exponential replication kinetics; both the genes and the container grow exponentially with the same exponent. The parabolic gene replication does not limit the protocellular growth, and a detailed stoichiometric control of the individual protocell components is not necessary to ensure a balanced gene-container growth as conjectured by various authors (Gánti 2004 Chemoton theory). Our analysis also suggests that the exponential growth of most modern biological systems emerges from the inherent spatial quality of the container replication process as we show analytically how the internal gene and metabolic kinetics determine the cell population's generation time and not the growth law (Burdett & Kirkwood 1983 J. Theor. Biol. 103, 11-20; Novak et al. 1998 Biophys. Chem. 72, 185-200; Tyson et al. 2003 Curr. Opin. Cell Biol. 15, 221-231). Previous extensive replication reaction kinetic studies have mainly focused on template replication and have not included a coupling to metabolic container dynamics (Stadler et al. 2000 Bull. Math. Biol. 62, 1061-1086; Stadler & Stadler 2003 Adv. Comp. Syst. 6, 47). The reported results extend these investigations. Finally, the coordinated exponential gene-container growth law stemming from catalysis is an encouraging circumstance for the many experimental groups currently engaged in assembling self-replicating minimal artificial cells (Szostak et al. 2001 Nature 409, 387-390; Pohorille & Deamer 2002 Trends Biotech. 20, 123-128; Rasmussen et al. 2004 Science 303, 963-965; Szathmáry 2005 Nature 433, 469-470; Luisi et al. 2006 Naturwissenschaften 93, 1-13).
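
Illustrative note (not taken from the paper): the contrast between parabolic and exponential growth can be made concrete with the simplest rate laws usually associated with the two terms. The symbols x (template concentration), k (rate constant), and x_0 (initial concentration) are assumptions introduced here; the paper's own model couples templates to container dynamics and is more detailed.

```latex
% Parabolic growth: rate proportional to the square root of the concentration
\frac{dx}{dt} = k\,x^{1/2}
\quad\Longrightarrow\quad
x(t) = \left(\sqrt{x_0} + \tfrac{1}{2}\,k\,t\right)^{2}

% Exponential growth: rate proportional to the concentration itself
\frac{dx}{dt} = k\,x
\quad\Longrightarrow\quad
x(t) = x_0\,e^{k t}
```

Differentiating the first solution gives 2(\sqrt{x_0} + kt/2)(k/2) = k\sqrt{x}, which confirms it. The parabolic solution grows only quadratically in t, so it is eventually outpaced by any exponential with a positive exponent.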

19.
20.
Tissue classification with gene expression profiles.   Cited by: 29 (self-citations: 0, citations by others: 29)
Constantly improving gene expression profiling technologies are expected to provide understanding and insight into cancer-related cellular processes. Gene expression data is also expected to significantly aid in the development of efficient cancer diagnosis and classification platforms. In this work we examine three sets of gene expression data measured across sets of tumor(s) and normal clinical samples: The first set consists of 2,000 genes, measured in 62 epithelial colon samples (Alon et al., 1999). The second consists of approximately 100,000 clones, measured in 32 ovarian samples (unpublished extension of data set described in Schummer et al. (1999)). The third set consists of approximately 7,100 genes, measured in 72 bone marrow and peripheral blood samples (Golub et al., 1999). We examine the use of scoring methods, measuring separation of tissue type (e.g., tumors from normals) using individual gene expression levels. These are then coupled with high-dimensional classification methods to assess the classification power of complete expression profiles. We present results of performing leave-one-out cross validation (LOOCV) experiments on the three data sets, employing nearest neighbor classifier, SVM (Cortes and Vapnik, 1995), AdaBoost (Freund and Schapire, 1997) and a novel clustering-based classification technique. As tumor samples can differ from normal samples in their cell-type composition, we also perform LOOCV experiments using appropriately modified sets of genes, attempting to eliminate the resulting bias. We demonstrate success rate of at least 90% in tumor versus normal classification, using sets of selected genes, with, as well as without, cellular-contamination-related members. These results are insensitive to the exact selection mechanism, over a certain range.
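
Illustrative sketch (not the authors' pipeline): leave-one-out cross validation holds out each sample in turn, trains on the remainder, and records whether the held-out sample is classified correctly. The example below pairs it with a 1-nearest-neighbor classifier using squared Euclidean distance; the distance measure, function names, and toy data are assumptions for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_neighbor_label(train_x, train_y, x):
    """Label of the training sample closest to x."""
    best = min(range(len(train_x)), key=lambda i: sq_dist(train_x[i], x))
    return train_y[best]

def loocv_accuracy(samples, labels):
    """Fraction of samples classified correctly when each is held out once."""
    correct = 0
    for i in range(len(samples)):
        # Train on everything except sample i, then test on sample i.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_neighbor_label(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)

# Hypothetical two-gene profiles for tumor and normal tissue.
samples = [[5.0, 0.2], [4.7, 0.1], [5.2, 0.4],
           [0.3, 4.9], [0.1, 5.1], [0.4, 4.6]]
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
accuracy = loocv_accuracy(samples, labels)
```

When genes are selected from the data before classification, the selection step would normally be repeated inside each fold so that the held-out sample does not influence which genes are chosen.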
