Similar articles
Found 20 similar articles; search took 10 ms
1.
MAGEST is a database of newly identified maternal cDNAs of the ascidian Halocynthia roretzi, built to examine the population of maternal mRNAs. We have collected 3' and 5' tag sequences of mRNAs and their expression data from whole-mount in situ hybridization in early embryos. To date, we have determined more than 2000 tag sequences of H. roretzi cDNAs and deposited them in public databases. The tag sequences, the expression data, and additional information can be obtained through MAGEST via the WWW at http://www.genome.ad.jp/magest/

2.
Minute tissue samples or single cells increasingly provide the starting material for gene expression profiling, which often requires RNA amplification. Although much effort has been put into optimizing amplification protocols, the relative abundance of RNA templates in the amplified product is frequently biased. We applied a T7 polymerase-based technique to amplify RNA from two tissues of a cichlid fish and compared expression levels of unamplified and amplified RNA on a cDNA microarray. Amplification bias was generally minor and comprised features that were lost (1.3%) or gained (2.5%) through amplification and features that were scored as regulated before but unregulated after amplification (4.2%) or vice versa (19.5%). We examined 10 sequence-specific properties and found that GC content, folding energy, hairpin length and number, and lengths of poly(A) and poly(T) stretches significantly affected RNA amplification. We conclude that, if RNA amplification is used in gene expression studies, preceding experiments controlling for amplification bias should be performed.
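Two of the sequence properties named in this abstract, GC content and the length of poly(A) or poly(T) stretches, are simple to compute. The following is a minimal sketch for illustration only; the function names `gc_content` and `longest_run` are hypothetical and are not taken from the study, whose actual feature definitions are not given here.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def longest_run(seq: str, base: str) -> int:
    """Return the length of the longest uninterrupted run of `base` in seq."""
    best = current = 0
    target = base.upper()
    for b in seq.upper():
        # Extend the current run on a match, otherwise reset it.
        current = current + 1 if b == target else 0
        best = max(best, current)
    return best


if __name__ == "__main__":
    example = "ACGAAAATTA"  # made-up sequence, not data from the study
    print(gc_content(example))
    print(longest_run(example, "A"), longest_run(example, "T"))
```

Features like these could then be compared between probes that were and were not affected by amplification; how the authors tested significance is not stated in the abstract.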

3.

Background

Cancer is a heterogeneous disease caused by genomic aberrations and characterized by significant variability in clinical outcomes and response to therapies. Several subtypes of common cancers have been identified based on alterations of individual cancer genes, such as HER2, EGFR, and others. However, cancer is a complex disease driven by the interaction of multiple genes, so the copy number status of individual genes is not sufficient to define cancer subtypes and predict responses to treatments. A classification based on genome-wide copy number patterns would be better suited for this purpose.

Method

To develop a more comprehensive cancer taxonomy based on genome-wide patterns of copy number abnormalities, we designed an unsupervised classification algorithm that identifies genomic subgroups of tumors. This algorithm is based on a modified genomic Non-negative Matrix Factorization (gNMF) algorithm and includes several additional components, namely a pilot hierarchical clustering procedure to determine the number of clusters, a multiple random initiation scheme, a new stop criterion for the core gNMF, as well as a 10-fold cross-validation stability test for quality assessment.
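To make the "multiple random initiation" idea concrete, here is a minimal sketch of plain non-negative matrix factorization with the standard Lee and Seung multiplicative updates, restarted from several random starting points and keeping the factorization with the lowest reconstruction error. This is not the authors' modified gNMF: the stop criterion is a fixed iteration count, the pilot hierarchical clustering and cross-validation steps are omitted, and all names (`nmf_once`, `nmf_best_of`) and parameter defaults are assumptions.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(row) for row in zip(*m)]


def squared_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((a - b) ** 2 for rv, rw in zip(v, wh) for a, b in zip(rv, rw))


def nmf_once(v, rank, iters, rng, eps=1e-9):
    """One NMF run from a random start using multiplicative updates."""
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h, squared_error(v, w, h)


def nmf_best_of(v, rank, restarts=5, iters=200, seed=0):
    """Run NMF from several random starts and keep the lowest-error result."""
    rng = random.Random(seed)
    runs = (nmf_once(v, rank, iters, rng) for _ in range(restarts))
    return min(runs, key=lambda run: run[2])
```

In a clustering setting, each tumor (a column of V) would typically be assigned to the row of H with the largest coefficient, with `rank` set to the number of clusters chosen beforehand.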

Result

We applied our algorithm to identify genomic subgroups of three major cancer types: non-small cell lung carcinoma (NSCLC), colorectal cancer (CRC), and malignant melanoma. High-density SNP array datasets for patient tumors and established cell lines were used to define genomic subclasses of the diseases and identify cell lines representative of each genomic subtype. The algorithm was compared with several traditional clustering methods and showed improved performance. To validate our genomic taxonomy of NSCLC, we correlated the genomic classification with disease outcomes. Overall survival time and time to recurrence were shown to differ significantly between the genomic subtypes.

Conclusions

We developed an algorithm for cancer classification based on genome-wide patterns of copy number aberrations and demonstrated its superiority to existing clustering methods. The algorithm was applied to define genomic subgroups of three cancer types and identify cell lines representative of these subgroups. Our data enabled the assembly of representative cell line panels for testing drug candidates.

4.
5.
6.
7.
8.

Background  

Multiple sequence alignment is the foundation of many important applications in bioinformatics that aim at detecting functionally important regions, predicting protein structures, building phylogenetic trees, etc. Although the automatic construction of a multiple sequence alignment for a set of remotely related sequences is a very challenging and error-prone task, many downstream analyses still rely heavily on the accuracy of the alignments.

9.
Gene expression profiling on microarrays is widely used to measure the expression of large numbers of genes in a single experiment. Because of the high cost of this method, feasible numbers of replicates are limited, thus impairing the power of statistical analysis. As a step toward reducing technically induced variation, we developed a procedure of sample preparation and analysis that minimizes the number of sample manipulation steps, introduces quality control before array hybridization, and allows recovery of the prepared mRNA for independent validation of results. Sample preparation is based on mRNA separation using oligo(dT) magnetic beads, which are subsequently used for first-strand cDNA synthesis on the beads. cDNA covalently bound to the magnetic beads is used as template for second-strand cDNA synthesis, leaving the intact mRNA in solution for further analysis. The quality of the synthesized cDNA can be assessed by quantitative polymerase chain reaction using 3'- and 5'-specific primer pairs for housekeeping genes such as glyceraldehyde-3-phosphate dehydrogenase. Second-strand cDNA is chemically labeled with fluorescent dyes to avoid dye bias in enzymatic labeling reactions. After hybridization of two differently labeled samples to microarray slides, arrays are scanned and images analyzed automatically with high reproducibility. Quantile-normalized data from five biological replicates display a coefficient of variation 45% for 90% of profiled genes, allowing detection of twofold changes with false positive and false negative rates of 10% each. We demonstrate successful application of the procedure for expression profiling in plant leaf tissue. However, the method could be easily adapted for samples of animal (including human) or microbial origin.
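The abstract reports a coefficient of variation on quantile-normalized replicate data. As a rough illustration of those two steps, the sketch below implements textbook quantile normalization (each array is forced onto the mean sorted distribution) and the per-gene coefficient of variation. It is a generic version, not the authors' pipeline: ties are broken by original order rather than averaged, and the function names are hypothetical.

```python
from statistics import mean, stdev


def quantile_normalize(samples):
    """Quantile-normalize a list of arrays (one list of intensities per array).

    Every array must contain the same genes in the same order.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same number of genes")
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean intensity at each rank across arrays.
    rank_means = [mean(s[i] for s in sorted_samples) for i in range(n)]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # gene indices by rank
        out = [0.0] * n
        for rank, gene_index in enumerate(order):
            out[gene_index] = rank_means[rank]
        normalized.append(out)
    return normalized


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


if __name__ == "__main__":
    arrays = [[5, 2, 3], [4, 1, 6]]  # made-up intensities, two arrays x three genes
    norm = quantile_normalize(arrays)
    for gene in zip(*norm):  # iterate genes across replicates
        print(gene, coefficient_of_variation(gene))
```

After normalization every array has the same set of values, so remaining per-gene variation across replicates reflects rank differences rather than overall intensity shifts.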

10.
Gene expression profiling using microarrays has been limited to comparisons of gene expression between small numbers of samples within individual experiments. However, the unknown and variable sensitivities of each probeset have rendered the absolute expression of any given gene nearly impossible to estimate. We have overcome this limitation by using a very large number (>10,000) of varied microarray data as a common reference, so that statistical attributes of each probeset, such as the dynamic range and threshold between low and high expression, can be reliably discovered through meta-analysis. This strategy is implemented in a web-based platform named "Gene Expression Commons" (https://gexc.stanford.edu/), which contains data from 39 distinct, highly purified mouse hematopoietic stem/progenitor/differentiated cell populations covering almost the entire hematopoietic system. Since the Gene Expression Commons is designed as an open platform, investigators can explore the expression level of any gene, search by expression patterns of interest, submit their own microarray data, and design their own working models representing biological relationships among samples.

11.
VizStruct: exploratory visualization for gene expression profiling   (Total citations: 2; self-citations: 0; citations by others: 2)
MOTIVATION: DNA arrays provide a broad snapshot of the state of the cell by measuring the expression levels of thousands of genes simultaneously. Visualization techniques can enable the exploration and detection of patterns and relationships in a complex data set by presenting the data in a graphical format in which the key characteristics become more apparent. The dimensionality and size of array data sets, however, present significant challenges to visualization. The purpose of this study is to present an interactive approach for visualizing variations in gene expression profiles and to assess its usefulness for classifying samples. RESULTS: The first Fourier harmonic projection was used to map multi-dimensional gene expression data to two dimensions in an implementation called VizStruct. The visualization method was tested using the differentially expressed genes identified in eight separate gene expression data sets. The samples were classified using the oblique decision tree (OC1) algorithm to provide a procedure for visualization-driven classification. The classifiers were evaluated by the holdout and the cross-validation techniques. The proposed method was found to achieve high accuracy. AVAILABILITY: Detailed mathematical derivation of all mapping properties as well as figures in color can be found as supplementary material on the web page http://www.cse.buffalo.edu/DBGROUP/bioinformatics/supplementary/vizstruct. All programs were written in Java and Matlab, and software code is available by request from the first author.
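A first Fourier harmonic projection can be read as taking the k = 1 coefficient of the discrete Fourier transform of an n-dimensional expression vector and plotting its real and imaginary parts as a 2-D point. The sketch below follows that reading; the sign convention and the absence of any scaling are assumptions, since the exact mapping is only given in the paper's supplementary material, and the function name is hypothetical. VizStruct itself was written in Java and Matlab; Python is used here purely for illustration.

```python
import cmath
import math


def first_harmonic_projection(x):
    """Map an n-dimensional vector to 2-D via its first DFT harmonic.

    Computes F1 = sum_k x[k] * exp(-2*pi*i*k/n) and returns (Re F1, Im F1).
    """
    n = len(x)
    if n == 0:
        raise ValueError("vector must not be empty")
    f1 = sum(value * cmath.exp(-2j * math.pi * k / n) for k, value in enumerate(x))
    return f1.real, f1.imag


if __name__ == "__main__":
    # Made-up expression profiles over four genes, not data from the study.
    for profile in ([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]):
        print(profile, first_harmonic_projection(profile))
```

Under this mapping a flat profile lands at the origin, while profiles dominated by different genes land at different angles, which is what makes the 2-D scatter usable for visual classification.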

12.
Lotus japonicus has received increased attention as a potential model legume plant. In order to study gene expression in reproductive organs and to identify genes that play a crucial role in sexual reproduction, we constructed a cDNA library from immature flower buds containing anthers at the stage of developing tapetum cells in L. japonicus, and characterized 919 expressed sequence tags (ESTs) randomly selected from this library. The 919 ESTs analyzed were clustered into 821 non-redundant EST groups. In a database search, 436 groups (53%) of the 821 groups showed sequence similarity to genes registered in the public database. Of these 436 groups, 109 groups showed similarity to genes encoding hypothetical proteins whose function had not yet been estimated. Three hundred eighty-five groups (47%) showed no significant homology to known sequences and were classified as novel sequences. A comparison of the 821 non-redundant EST sequences with EST sequences derived from the whole plant of L. japonicus revealed that 474 EST sequences derived from immature flower buds were not found among the EST sequences of the whole plant. In order to confirm the expression pattern of potential reproductive-organ-specific EST clones, nine clones that did not match ESTs derived from the whole plant were selected, and RT-PCR analysis was performed on these clones. The RT-PCR analysis identified two novel anther-specific clones. One clone was homologous to a gene encoding a human cleft lip and palate associated transmembrane protein (CLPTM1)-like protein, and the other clone did not show significant similarity to any genes deposited in the public database. These results indicate that the ESTs analyzed here represent a valuable resource for finding reproductive-organ-specific genes in Lotus japonicus.

13.
14.
15.
16.
17.
18.
19.
20.
The availability of sequenced genomes has generated a need for experimental approaches that allow the simultaneous analysis of large, or even complete, sets of genes. To facilitate such analyses, we have developed GST-PRIME, a software package for retrieving and assembling gene sequences, even from complex genomes, using the NCBI public database, and then designing sets of primer pairs for use in gene amplification. Primers were designed by the program for the direct amplification of gene sequence tags (GSTs) from either genomic DNA or cDNA. Test runs of GST-PRIME on 2000 randomly selected Arabidopsis and Drosophila genes demonstrate that 93 and 88% of resulting GSTs, respectively, fulfilled imposed length criteria. GST-PRIME primer pairs were tested on a set of 1900 Arabidopsis genes coding for chloroplast-targeted proteins: 95% of the primer pairs used in PCRs with genomic DNA generated the correct amplicons. GST-PRIME can thus be reliably used for large-scale or specific amplification of intron-containing genes of multicellular eukaryotes.
